Description

Background of the Invention 

[0001] The estimated 50,000-100,000 genes scattered along
the human chromosomes offer tremendous promise for the
understanding, diagnosis, and treatment of human diseases.
In addition, probes capable of specifically hybridizing to
loci distributed throughout the human genome find
applications in the construction of high resolution
chromosome maps and in the identification of individuals.

[0002] In the past, the characterization of even a single
human gene was a painstaking process, requiring years of
effort. Recent developments in the areas of cloning vectors,
DNA sequencing, and computer technology have merged to
greatly accelerate the rate at which human genes can be
isolated, sequenced, mapped, and characterized. Cloning
vectors such as yeast artificial chromosomes (YACs) and
bacterial artificial chromosomes (BACs) are able to accept
DNA inserts ranging from 300 to 1000 kilobases (kb) or
100-400 kb in length respectively, thereby facilitating the
manipulation and ordering of DNA sequences distributed over
great distances on the human chromosomes. Automated DNA
sequencing machines permit the rapid sequencing of human
genes. Bioinformatics software enables the comparison of
nucleic acid and protein sequences, thereby assisting in the
characterization of human gene products.
[0003] Currently, two different approaches are being pursued
for identifying and characterizing the genes distributed
along the human genome. In one approach, large fragments of
genomic DNA are isolated, cloned, and sequenced. Potential
open reading frames in these genomic sequences are
identified using bioinformatics software. However, this
approach entails sequencing large stretches of human DNA
which do not encode proteins in order to find the protein
encoding sequences scattered throughout the genome. In
addition to requiring extensive sequencing, the
bioinformatics software may mischaracterize the genomic
sequences obtained. Thus, the software may produce false
positives in which non-coding DNA is mischaracterized as
coding DNA or false negatives in which coding DNA is
mislabeled as non-coding DNA.

[0004] An alternative approach takes a more direct route to
identifying and characterizing human genes. In this
approach, complementary DNAs (cDNAs) are synthesized from
isolated messenger RNAs (mRNAs) which encode human proteins.
Using this approach, sequencing is only performed on DNA
which is derived from protein coding portions of the genome.
Often, only short stretches of the cDNAs are sequenced to
obtain sequences called expressed sequence tags (ESTs). The
ESTs may then be used to isolate or purify extended cDNAs
which include sequences adjacent to the EST sequences. The
extended cDNAs may contain all of the sequence of the EST
which was used to obtain them or only a portion of the
sequence of the EST which was used to obtain them. In
addition, the extended cDNAs may contain the full coding
sequence of the gene from which the EST was derived or,
alternatively, the extended cDNAs may include portions of
the coding sequence of the gene from which the EST was
derived. It will be appreciated that there may be several
extended cDNAs which include the EST sequence as a result of
alternate splicing or the activity of alternative promoters.
Alternatively, ESTs having partially overlapping sequences
may be identified and contigs comprising the consensus
sequences of the overlapping ESTs may be identified.
[0005] In the past, these short EST sequences were often
obtained from oligo-dT primed cDNA libraries. Accordingly,
they mainly corresponded to the 3' untranslated region of
the mRNA. In part, the prevalence of EST sequences derived
from the 3' end of the mRNA is a result of the fact that
typical techniques for obtaining cDNAs are not well suited
for isolating cDNA sequences derived from the 5' ends of
mRNAs (Adams et al., Nature 377:3-174, 1996; Hillier et al.,
Genome Res. 6:807-828, 1996).

[0006] In addition, in those reported instances where longer
cDNA sequences have been obtained, the reported sequences
typically correspond to coding sequences and do not include
the full 5' untranslated region (5'UTR) of the mRNA from
which the cDNA is derived. 5'UTRs are often involved in the
regulation of gene expression by affecting either the
stability or translation of mRNAs. Indeed, 5'UTRs may
contain several features known to affect the initiation of
translation: (i) the distance between the cap structure and
the initiation codon, (ii) the presence of cis-acting
elements which may be either linear sequences such as
polypyrimidine tracts (Kaspar et al., J. Biol. Chem.
267:508-514, 1992; Severson et al., Eur J Biochem
229:426-32, 1995) or secondary structures such as IREs
(Rouault and Klausner, Curr Top Cell Regul 35:1-19, 1997),
and (iii) upstream open reading frames or uORFs (Geballe and
Morris, Trends Biochem Sci 19:159-64, 1994). Thus,
regulation of gene expression may be achieved through the
use of alternative 5'UTRs. For instance, the translation of
the tissue inhibitor of metalloprotease mRNA is enhanced in
mitogenically activated cells through modification of the
start codon of a uORF in its 5'UTR using an alternative
promoter (Waterhouse et al., J Biol Chem 265:5585-9, 1990).
Furthermore, modification of the 5'UTR through mutation,
insertion or translocation events may even be implicated in
pathogenesis. For instance, the fragile X syndrome, the most
common cause of inherited mental retardation, is partly due
to an insertion of multiple CGG trinucleotides in the 5'UTR
of the fragile X mRNA resulting in the inhibition of protein
synthesis via ribosome stalling (Feng et al., Science
268:731-4, 1995). An aberrant mutation in regions of the
5'UTR known to inhibit translation of the proto-oncogene
c-myc was shown to result in upregulation of c-myc protein
levels in cells derived from patients with multiple myelomas
(Willis et al., Curr Top Microbiol Immunol 224:269-76,
1997). However, the use of oligo-dT primed cDNA libraries
does not allow the isolation of complete 5'UTRs since such
obtained incomplete sequences may not include the first exon
of the mRNA, particularly in situations where the first exon
is short. Furthermore, they may not include some exons,
often short ones, which are located upstream of splicing
sites.
Thus, there is a need to obtain sequences derived from
the 5' ends of mRNAs.

[0007] While many sequences derived from human chromosomes
have practical applications, approaches based on the
identification and characterization of those chromosomal
sequences which encode a protein product are particularly
relevant to diagnostic and therapeutic uses. In some
instances, the sequences used in such therapeutic or
diagnostic techniques may be sequences which encode proteins
which are secreted from the cell in which they are
synthesized. Those sequences, as well as the secreted
proteins themselves, are particularly valuable as potential
therapeutic agents. Such proteins are often involved in cell
to cell communication and may be responsible for producing a
clinically relevant response in their target cells. In fact,
several secretory proteins, including tissue plasminogen
activator, G-CSF, GM-CSF, erythropoietin, human growth
hormone, insulin, interferon-α, interferon-β, interferon-γ,
and interleukin-2, are currently in clinical use. These
proteins are used to treat a wide range of conditions,
including acute myocardial infarction, acute ischemic
stroke, anemia, diabetes, growth hormone deficiency,
hepatitis, kidney carcinoma, chemotherapy-induced
neutropenia and multiple sclerosis. For these reasons,
extended cDNAs encoding secreted proteins or portions
thereof represent a valuable source of therapeutic agents.
Thus, there is a need for the identification and
characterization of secreted proteins and the nucleic acids
encoding them.
[0008] In addition to being therapeutically useful
themselves, secretory proteins include short peptides,
called signal peptides, at their amino termini which direct
their secretion. These signal peptides are encoded by the
signal sequences located at the 5' ends of the coding
sequences of genes encoding secreted proteins. These signal
peptides can be used to direct the extracellular secretion
of any protein to which they are operably linked. In
addition, portions of the signal peptides called
membrane-translocating sequences may also be used to direct
the intracellular import of a peptide or protein of
interest. This may prove beneficial in gene therapy
strategies in which it is desired to deliver a particular
gene product to cells other than the cell in which it is
produced. Signal sequences encoding signal peptides also
find application in simplifying protein purification
techniques. In such applications, the extracellular
secretion of the desired protein greatly facilitates
purification by reducing the number of undesired proteins
from which the desired protein must be selected. Thus, there
exists a need to identify and characterize the 5' portions
of the genes for secretory proteins which encode signal
peptides.
[0009] Sequences coding for non-secreted proteins may also
find application as therapeutics or diagnostics. In
particular, such sequences may be used to determine whether
an individual is likely to express a detectable phenotype,
such as a disease, as a consequence of a mutation in the
coding sequence for a non-secreted protein or for a secreted
protein. In instances where the individual is at risk of
suffering from a disease or other undesirable phenotype as a
result of a mutation in such a coding sequence, the
undesirable phenotype may be corrected by introducing a
normal coding sequence using gene therapy. Alternatively, if
the undesirable phenotype results from overexpression of the
protein encoded by the coding sequence, expression of the
protein may be reduced using antisense or triple helix based
strategies.

[0010] The secreted or non-secreted human polypeptides
encoded by the coding sequences may also be used as
therapeutics by administering them directly to an individual
having a condition, such as a disease, resulting from a
mutation in the sequence encoding the polypeptide. In such
an instance, the condition can be cured or ameliorated by
administering the polypeptide to the individual.
[0011] In addition, the secreted or non-secreted human
polypeptides or portions thereof may be used to generate
antibodies useful in determining the tissue type or species
of origin of a biological sample. The antibodies may also be
used to determine the cellular localization of the secreted
or non-secreted human polypeptides or the cellular
localization of polypeptides which have been fused to the
human polypeptides. In addition, the antibodies may also be
used in immunoaffinity chromatography techniques to isolate,
purify or enrich the human polypeptide or a target
polypeptide which has been fused to the human polypeptide.
[0012] Public information on the number of human genes for
which the promoters and upstream regulatory regions have
been identified and characterized is quite limited. In part,
this may be due to the difficulty of isolating such
regulatory sequences. Upstream regulatory sequences such as
transcription factor binding sites are typically too short
to be utilized as probes for isolating promoters from human
genomic libraries. Recently, some approaches have been
developed to isolate human promoters. One of them consists
of making a CpG island library (Cross et al., Nature
Genetics 6:236-244, 1994). The second consists of isolating
human genomic DNA sequences containing SpeI binding sites by
the use of SpeI binding protein (Mortlock et al., Genome
Res. 6:327-335, 1996). Both of these approaches have their
limits due to a lack of specificity or because they are not
universally applicable, since only a limited number of
promoters have either a CpG island or a SpeI recognition
site and because SpeI binding sites are not specifically
found in promoter regions. Thus, there exists a need to
identify and systematically characterize the 5' portions of
the genes.

[0013] The present 5' ESTs may be used to identify and
isolate 5'UTRs and upstream regulatory regions which control
the location, developmental stage, rate, and quantity of
protein synthesis, as well as the stability of the mRNA.
Once identified and characterized, these regulatory regions
may be utilized in gene therapy or protein purification
schemes to obtain the desired amount and locations of
protein synthesis or to inhibit, reduce, or prevent the
synthesis of undesirable gene products.

[0014] In addition, ESTs containing the 5' ends of protein
genes may include sequences useful as probes for chromosome
mapping and the identification of individuals. Thus, there
is a need to identify and characterize the sequences
upstream of the 5' coding sequences of protein genes.



Summary of the Invention

[0015] The present invention relates to purified, isolated,
or enriched 5' ESTs which include sequences derived from the
authentic 5' ends of their corresponding mRNAs. The term
"corresponding mRNA" refers to the mRNA which was the
template for the cDNA synthesis which produced the 5' EST.
These sequences will be referred to hereinafter as "5'
ESTs." The present invention also includes purified,
isolated or enriched nucleic acids comprising contigs
assembled by determining a consensus sequence from a
plurality of ESTs containing overlapping sequences. These
contigs will be referred to herein as "consensus contigated
ESTs."
[0016] As used herein, the term "purified" does not require
absolute purity; rather, it is intended as a relative
definition. Individual clones isolated from a cDNA library
have been conventionally purified to electrophoretic
homogeneity. The sequences obtained from these clones could
not be obtained directly either from the library or from
total human DNA. The cDNA clones are not naturally occurring
as such, but rather are obtained via manipulation of a
partially purified naturally occurring substance (messenger
RNA). The conversion of mRNA into a cDNA library involves
the creation of a synthetic substance (cDNA), and pure
individual cDNA clones can be isolated from the synthetic
library by clonal selection. Thus, creating a cDNA library
from messenger RNA and subsequently isolating individual
clones from that library results in an approximately
10^4-10^6 fold purification of the native message.
Purification of starting material or natural material to at
least one order of magnitude, preferably two or three
orders, and more preferably four or five orders of magnitude
is expressly contemplated.

[0017] As used herein, the term "isolated" requires that the
material be removed from its original environment (e.g., the
natural environment if it is naturally occurring). For
example, a naturally-occurring polynucleotide present in a
living animal is not isolated, but the same polynucleotide,
separated from some or all of the coexisting materials in
the natural system, is isolated.

[0018] As used herein, the term "enriched" means that the 5'
EST is adjacent to "backbone" nucleic acid to which it is
not adjacent in its natural environment. Additionally, to be
"enriched" the 5' ESTs will represent 5% or more of the
number of nucleic acid inserts in a population of nucleic
acid backbone molecules. Backbone molecules according to the
present invention include nucleic acids such as expression
vectors, self-replicating nucleic acids, viruses,
integrating nucleic acids, and other vectors or nucleic
acids used to maintain or manipulate a nucleic acid insert
of interest. Preferably, the enriched 5' ESTs represent 15%
or more of the number of nucleic acid inserts in the
population of recombinant backbone molecules. More
preferably, the enriched 5' ESTs represent 50% or more of
the number of nucleic acid inserts in the population of
recombinant backbone molecules. In a highly preferred
embodiment, the enriched 5' ESTs represent [...] of the
number of nucleic acid inserts in the population of
recombinant backbone molecules.

[0019] "Stringent", "moderate" and "low" hybridization
conditions are as defined below.

[0020] The term "polypeptide" refers to a polymer of amino
acids without regard to the length of the polymer; thus,
peptides, oligopeptides, and proteins are included within
the definition of polypeptide. This term also does not
specify or exclude post-expression modifications of
polypeptides; for example, polypeptides which include the
covalent attachment of glycosyl groups, acetyl groups,
phosphate groups, lipid groups and the like are expressly
encompassed by the term polypeptide. Also included within
the definition are polypeptides which contain one or more
analogs of an amino acid (including, for example,
non-naturally occurring amino acids, amino acids which only
occur naturally in an unrelated biological system, modified
amino acids from mammalian systems, etc.), polypeptides with
substituted linkages, as well as other modifications known
in the art, both naturally occurring and non-naturally
occurring.
[0021] As used interchangeably herein, the terms "nucleic
acids", "oligonucleotides", and "polynucleotides" include
RNA, DNA, or RNA/DNA hybrid sequences of more than one
nucleotide in either single chain or duplex form. The term
"nucleotide" is used herein as an adjective to describe
molecules comprising RNA, DNA, or RNA/DNA hybrid sequences
of any length in single-stranded or duplex form. The term
"nucleotide" is also used herein as a noun to refer to
individual nucleotides or varieties of nucleotides, meaning
a molecule, or individual unit in a larger nucleic acid
molecule, comprising a purine or pyrimidine, a ribose or
deoxyribose sugar moiety, and a phosphate group, or
phosphodiester linkage in the case of nucleotides within an
oligonucleotide or polynucleotide. The term "nucleotide" is
also used herein to encompass "modified nucleotides" which
comprise at least one of the following modifications: (a) an
alternative linking group, (b) an analogous form of purine,
(c) an analogous form of pyrimidine, or (d) an analogous
sugar. For examples of analogous linking groups, purines,
pyrimidines, and sugars see, for example, PCT publication
No. WO 95/04064. The polynucleotide sequences of the
invention may be prepared by any known method, including
synthetic, recombinant, ex vivo generation, or a combination
thereof, as well as utilizing any purification methods known
in the art.
[0022] The terms "base paired" and "Watson & Crick base
paired" are used interchangeably herein to refer to
nucleotides which can be hydrogen bonded to one another by
virtue of their sequence identities in a manner like that
found in double-helical DNA, with thymine or uracil residues
linked to adenine residues by two hydrogen bonds and
cytosine and guanine residues linked by three hydrogen bonds
(See Stryer, L., Biochemistry, 4th edition, 1995).

[0023] The terms "complementary" or "complement thereof" are
used herein to refer to the sequences of polynucleotides
which are capable of forming Watson & Crick base pairing
with another specified polynucleotide throughout the
entirety of the complementary region. For the purpose of the
present invention, a first polynucleotide is deemed to be
complementary to a second polynucleotide when each base in
the first polynucleotide is paired with its complementary
base. Complementary bases are, generally, A and T (or A and
U), or C and G. "Complement" is used herein as a synonym for
"complementary polynucleotide", "complementary nucleic acid"
and "complementary nucleotide sequence". These terms are
applied to pairs of polynucleotides based solely upon their
sequences and not any particular set of conditions under
which the two polynucleotides would actually bind.
Preferably, a "complementary" sequence is a sequence which
has an A at each position where there is a T on the opposite
strand, a T at each position where there is an A on the
opposite strand, a G at each position where there is a C on
the opposite strand and a C at each position where there is
a G on the opposite strand.
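
The sequence-only definition of complementarity given above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the disclosure: the function names are hypothetical, both sequences are assumed to be written 5' to 3' so that pairing is antiparallel, and U in an input is treated as pairing with A.

```python
# Watson & Crick partners; U (RNA) is accepted in the input and pairs with A.
# Assumption: the complement is written out as DNA (A pairs with T).
_PAIRS = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(_PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


def is_complementary(first: str, second: str) -> bool:
    """True when every base of `first` pairs with a base of `second`.

    Only the sequences are compared; no hybridization conditions are modeled.
    """
    second_dna = second.upper().replace("U", "T")
    return len(first) == len(second) and reverse_complement(first) == second_dna
```

For instance, `reverse_complement("ATGC")` gives `"GCAT"`, so `is_complementary("ATGC", "GCAT")` holds, while a single mismatched base makes the check fail.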

[0024] Thus, 5' ESTs in cDNA libraries in which one or more
5' ESTs make up 5% or more of the number of nucleic acid
inserts in the backbone molecules are "enriched recombinant
5' ESTs" as defined herein. Likewise, 5' ESTs in a
population of plasmids in which one or more 5' ESTs of the
present invention have been inserted such that they
represent 5% or more of the number of inserts in the plasmid
backbone are "enriched recombinant 5' ESTs" as defined
herein. However, 5' ESTs in cDNA libraries in which 5' ESTs
constitute less than 5% of the number of nucleic acid
inserts in the population of backbone molecules, such as
libraries in which backbone molecules having a 5' EST insert
are extremely rare, are not "enriched recombinant 5' ESTs."
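
As a rough illustration of the 5% criterion described above, the check reduces to comparing the number of 5' EST inserts with the total number of inserts in the population of backbone molecules. The sketch below is hypothetical (the function name and parameters are not taken from the specification) and uses integer arithmetic so that the boundary case of exactly 5% is classified as enriched.

```python
def is_enriched(est_inserts: int, total_inserts: int, threshold_percent: int = 5) -> bool:
    """Return True when 5' EST inserts make up at least `threshold_percent`
    percent of all nucleic acid inserts in the backbone population.

    The default of 5 mirrors the "5% or more" wording; 15 or 50 can be
    passed to test the preferred levels.
    """
    if total_inserts <= 0:
        raise ValueError("total_inserts must be positive")
    if not 0 <= est_inserts <= total_inserts:
        raise ValueError("est_inserts must lie between 0 and total_inserts")
    # Integer comparison avoids floating-point rounding at the threshold.
    return est_inserts * 100 >= threshold_percent * total_inserts
```

Under these assumptions, a library with 5 such inserts out of 100 qualifies, whereas one with 4 out of 100 does not.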
[0025] In some embodiments, the present invention relates to
5' ESTs which are derived from genes encoding secreted
proteins. As used herein, a "secreted" protein is one which,
when expressed in a suitable host cell, is transported
across or through a membrane, including transport as a
result of signal peptides in its amino acid sequence.
"Secreted" proteins include without limitation proteins
secreted wholly (e.g. soluble proteins), or partially (e.g.
receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins
which are transported across the membrane of the endoplasmic
reticulum.

[0026] Such 5' ESTs include nucleic acid sequences, called
signal sequences, which encode signal peptides which direct
the extracellular secretion of the proteins encoded by the
genes from which the 5' ESTs are derived. Generally, the
signal peptides are located at the amino termini of secreted
proteins.
[0027] Secreted proteins are translated by ribosomes
associated with the "rough" endoplasmic reticulum.
Generally, secreted proteins are co-translationally
transferred to the membrane of the endoplasmic reticulum.
Association of the ribosome with the endoplasmic reticulum
during translation of secreted proteins is mediated by the
signal peptide. The signal peptide is typically cleaved
following its co-translational entry into the endoplasmic
reticulum. After delivery to the endoplasmic reticulum,
secreted proteins may proceed through the Golgi apparatus.
In the Golgi apparatus, the proteins may undergo
post-translational modification before entering secretory
vesicles which transport them across the cell membrane.
[0028] The 5' ESTs of the present invention have several
important applications. For example, they may be used to
obtain and express cDNA clones which include the full
protein coding sequences of the corresponding gene products,
including the authentic translation start sites derived from
the 5' ends of the coding sequences of the mRNAs from which
the 5' ESTs are derived. These cDNAs will be referred to
hereinafter as "full-length cDNAs." These cDNAs may also
include DNA derived from mRNA sequences upstream of the
translation start site. The full-length cDNA sequences may
be used to express the proteins corresponding to the 5'
ESTs. As discussed above, secreted proteins and non-secreted
proteins may be therapeutically important. Thus, the
proteins expressed from the cDNAs may be useful in treating
or controlling a variety of human conditions. The 5' ESTs
may also be used to obtain the corresponding genomic DNA.
The term "corresponding genomic DNA" refers to the genomic
DNA which encodes the mRNA from which the 5' EST was
derived.
[0029] Alternatively, the 5' ESTs may be used to obtain and
express extended cDNAs encoding portions of the protein. In
the case of secreted proteins, the portions may comprise the
signal peptides of the secreted proteins or the mature
proteins generated when the signal peptide is cleaved off.
[0030] The present invention includes isolated, purified, or
enriched "EST-related nucleic acids." The terms "isolated",
"purified" or "enriched" have the meanings provided above.
As used herein, the term "EST-related nucleic acids" means
the nucleic acids of SEQ ID NOs: 24-4100 and 8178-36681,
extended cDNAs obtainable using the nucleic acids of SEQ ID
NOs: 24-4100 and 8178-36681, full-length cDNAs obtainable
using the nucleic acids of SEQ ID NOs: 24-4100 and
8178-36681 or genomic DNAs obtainable using the nucleic
acids of SEQ ID NOs: 24-4100 and 8178-36681. The present
invention also includes the sequences complementary to the
EST-related nucleic acids.

[0031] The present invention also includes isolated,
purified or enriched "fragments of EST-related nucleic
acids." The terms "isolated", "purified" or "enriched" have
the meanings described above. As used herein the term
"fragments of EST-related nucleic acids" means fragments
comprising at least 10, 12, 15, 18, 20, [...], 28, 35, 40,
50, [...] consecutive nucleotides of the EST-related nucleic
acids to the extent that fragments of these lengths are
consistent with the lengths of the particular EST-related
nucleic acids being referred to. The present invention also
includes the sequences complementary to the fragments of the
EST-related nucleic acids.

[0032] The present invention also includes isolated,
purified, or enriched "positional segments of EST-related
nucleic acids." The terms "isolated", "purified", or
"enriched" have the meanings provided above. As used herein,
the term "positional segments of EST-related nucleic acids"
includes segments comprising nucleotides 1-25, 26-50, 51-75,
76-100, 101-125, 126-150, 151-175, 176-200, 201-225,
226-250, 251-300, 301-325, 326-350, 351-375, 376-400,
401-425, 426-450, 451-475, 476-500, 501-525, 526-550,
551-575, 576-600 and 601-the terminal nucleotide of the
EST-related nucleic acids to the extent that such nucleotide
positions are consistent with the lengths of the particular
EST-related nucleic acids being referred to. The term
"positional segments of EST-related nucleic acids" also
includes segments comprising nucleotides 1-50, 51-100,
101-150, 151-200, 201-250, 251-300, 301-350, 351-400,
401-450, 450-500, 501-550, 551-600 or 601-the terminal
nucleotide of the EST-related nucleic acids to the extent
that such nucleotide positions are consistent with the
lengths of the particular EST-related nucleic acids being
referred to. The term "positional segments of EST-related
nucleic acids" also includes segments comprising nucleotides
1-100, 101-200, 201-300, 301-400, 401-500, 500-600, or
601-the terminal nucleotide of the EST-related nucleic acids
to the extent that such nucleotide positions are consistent
with the lengths of the particular EST-related nucleic acids
being referred to. In addition, the term "positional
segments of EST-related nucleic acids" includes segments
comprising nucleotides 1-200, 201-400, 400-600, or 601-the
terminal nucleotide of the EST-related nucleic acids to the
extent that such nucleotide positions are consistent with
the lengths of the particular EST-related nucleic acids
being referred to. The present invention also includes the
sequences complementary to the positional segments of
EST-related nucleic acids.

[0033] The present invention also includes isolated,
purified, or enriched "fragments of positional segments of
EST-related nucleic acids." The terms "isolated",
"purified", or "enriched" have the meanings provided above.
As used herein, the term "fragments of positional segments
of EST-related nucleic acids" means fragments comprising at
least 10, 15, 18, [...], 40, 50, 75, 100, [...] consecutive
nucleotides of the positional segments of EST-related
nucleic acids. The present invention also includes the
sequences complementary to the fragments of positional
segments of EST-related nucleic acids.

[0034] The present invention also includes isolated or
purified "EST-related polypeptides." The terms "isolated" or
"purified" have the meanings provided above. As used herein,
the term "EST-related polypeptides" means the polypeptides
encoded by the EST-related nucleic acids, [...] SEQ ID NOs:
4101-8177.

[0035] The present invention also includes isolated or
purified "fragments of EST-related polypeptides." The terms
"isolated" or "purified" have the meanings provided above.
As used herein, the term "fragments of EST-related
polypeptides" means fragments comprising at least 5, 10, 15,
20, 25, 30, 35, [...] consecutive amino acids of an
EST-related polypeptide to the extent that fragments of
these lengths are consistent with the lengths of the
particular EST-related polypeptides being referred to.

[0036] The present invention also includes isolated or
purified "positional segments of EST-related polypeptides."
As used herein, the term "positional segments of EST-related
polypeptides" includes polypeptides comprising amino acid
residues 1-25, 26-50, 51-75, 76-100, 101-125, 126-150,
151-175, 176-200, or 201-the C-terminal amino acid of the
EST-related polypeptides to the extent that such amino acid
residues are consistent with the lengths of the particular
EST-related polypeptides being referred to. The term
"positional segments of EST-related polypeptides" also
includes segments comprising amino acid residues 1-50,
51-100, 101-150, 151-200 or 201-the C-terminal amino acid of
the EST-related polypeptides to the extent that such amino
acid residues are consistent with the lengths of the
particular EST-related polypeptides being referred to. The
term "positional segments of EST-related polypeptides" also
includes segments comprising [...] the EST-related
polypeptides to the extent that such amino acid residues are
consistent with the lengths of the particular EST-related
polypeptides being referred to. In addition, the term
"positional segments of EST-related polypeptides" includes
segments comprising amino acid residues 1-200 or 201-the
C-terminal amino acid of the EST-related polypeptides to the
extent that amino acid residues are consistent with the
lengths of the particular EST-related polypeptides being
referred to.
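
The positional segments described for EST-related polypeptides follow a simple pattern: fixed-width windows up to a cap, then one open-ended segment running to the C-terminal residue. A minimal Python sketch of that enumeration follows. The function name and parameters are hypothetical, positions are 1-based and inclusive, and it is an assumption of this sketch that "consistent with the lengths" means a fixed-width window is listed only when it lies wholly within the sequence.

```python
def positional_segments(length: int, width: int = 25, cap: int = 200) -> list[tuple[int, int]]:
    """Enumerate (start, end) positional segments, 1-based and inclusive.

    Windows of `width` residues are produced up to position `cap`; if the
    sequence is longer than `cap`, a final segment runs from cap + 1 to the
    terminal residue. The defaults mirror the 1-25 ... 176-200, 201-C-terminal
    series given for polypeptides.
    """
    if length < 0 or width <= 0 or cap <= 0 or cap % width != 0:
        raise ValueError("length must be >= 0 and cap a positive multiple of width")
    segments = []
    start = 1
    # Keep only windows that fit entirely inside both the cap and the sequence.
    while start + width - 1 <= min(cap, length):
        segments.append((start, start + width - 1))
        start += width
    if length > cap:
        segments.append((cap + 1, length))
    return segments
```

With these assumptions, a 230-residue polypeptide yields eight 25-residue windows followed by `(201, 230)`, and a 60-residue polypeptide yields only `(1, 25)` and `(26, 50)`. Passing `width=50` or `width=200` reproduces the coarser series.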
[0037] The present invention also includes isolated or
purified "fragments of positional segments of EST-related
polypeptides." The terms "isolated" or "purified" have the
meanings provided above. As used herein, the term "fragments
of positional segments of EST-related polypeptides" means
fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40,
50, 75, 100, or 150 consecutive amino acids of positional
segments of EST-related polypeptides, to the extent that
fragments of these lengths are consistent with the lengths
of the particular EST-related polypeptides being referred
to.

which specrticaliy recognize the is 
tides, fragments of EST-related po^^ " ; 
segments of EST-related pb^peptttfes, or fragments of 
' m '" : • x ' - ihtfie 



case of secreted prcrteins, such as ^ 
7^-7^ ^tibo^ 

mature pr^^ peplicle is 

cleaved may also be obtained as described' below Sim- 
ilarly, antibodies which speciiicafly recognize the signal 
pef^ri^of SEQfjp 



also be obtained. ....... . f ..V 

[0039] In some embodiments and in the case of secreted
proteins, the EST-related nucleic acids, fragments of
EST-related nucleic acids, positional segments of
EST-related nucleic acids, or fragments of positional
segments of nucleic acids include a signal sequence. In
other embodiments, the EST-related nucleic acids, fragments
of EST-related nucleic acids, positional segments of
EST-related nucleic acids, or fragments of positional
segments of nucleic acids may include the full coding
sequence for the protein or, in the case of secreted
proteins, the full coding sequence of the mature protein
(i.e. the protein generated when the signal polypeptide is
cleaved off). In addition, the EST-related nucleic acids,
fragments of EST-related nucleic acids, positional segments
of EST-related nucleic acids, or fragments of positional
segments of nucleic acids may include regulatory regions
upstream of the translation start site or downstream of the
stop codon which control the amount, location, or
developmental stage of gene expression.

[0040] As discussed above, secreted and non-secreted human
proteins may be therapeutically important. Thus, the
proteins expressed from the EST-related nucleic acids,
fragments of EST-related nucleic acids, positional segments
of EST-related nucleic acids, or fragments of positional
segments of nucleic acids may be useful in treating or
controlling a variety of human conditions.
[0041] The EST-related nucleic acids, fragments of
EST-related nucleic acids, positional segments of
EST-related nucleic acids, or fragments of positional
segments of nucleic acids may be used in forensic procedures
to identify individuals or in diagnostic procedures to
identify individuals having genetic diseases resulting from
abnormal gene expression. In addition, the EST-related
nucleic acids, fragments of EST-related nucleic acids,
positional segments of EST-related nucleic acids, or
fragments of positional segments of nucleic acids are useful
for constructing a high resolution map of the human
chromosomes.

[0042] The present invention also relates to secretion vectors capable of directing the secretion of a protein of interest. Such vectors may be used in gene therapy strategies in which it is desired to produce a gene product in one cell which is to be delivered to another location in the body. Secretion vectors may also facilitate the purification of desired proteins.

[0043] The present invention also relates to expression vectors capable of directing the expression of an inserted gene in a desired spatial or temporal manner or at a desired level. Such vectors may include sequences upstream of the EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids, or fragments of positional segments of nucleic acids, such as promoters or upstream regulatory sequences.

[0044] The present invention also comprises fusion vectors for making chimeric polypeptides comprising a first polypeptide and a second polypeptide. Such vectors are useful for determining the cellular localization of the chimeric polypeptides or for isolating, purifying or enriching the chimeric polypeptides.

[0045] The EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids, or fragments of positional segments of nucleic acids may also be used for gene therapy to control or treat genetic diseases. In the case of secreted proteins, signal peptides may be fused to heterologous proteins to direct their extracellular secretion.
[0046] Bacterial clones containing Bluescript plasmids having inserts containing the sequence of the non-clustered ESTs are presently stored at 80°C in 4% (v/v) glycerol in the inventor's laboratories under the [illegible] comprise a single EST from a single tissue in the listing of Table II. The inserts may be recovered from the stored materials by growing the appropriate clones on a suitable medium. The Bluescript DNA can then be isolated using plasmid isolation procedures familiar to those skilled in the art such as alkaline lysis minipreps or large scale alkaline lysis plasmid isolation procedures. If desired, the plasmid DNA may be further enriched by centrifugation on a cesium chloride gradient, size exclusion chromatography, or anion exchange chromatography. The plasmid DNA obtained using these procedures may then be manipulated using standard cloning techniques familiar to those skilled in the art. Alternatively, a PCR can be done with primers designed at both ends of the inserted EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids, or fragments of positional segments of nucleic acids. The PCR product which corresponds to the EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids, or fragments of positional segments of nucleic acids can then be manipulated using standard cloning techniques familiar to those skilled in the art.
[0047] One embodiment of the present invention is a purified nucleic acid comprising a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.

[0048] Another embodiment of the present invention is a purified nucleic acid comprising at least 10 consecutive nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.

[0049] Another embodiment of the present invention is a purified nucleic acid comprising at least 15 consecutive nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.

[0050] A further embodiment of the present invention is a purified nucleic acid comprising the coding sequence of a sequence selected from the group consisting of SEQ ID NOs: 24-4100.

[0051] Yet another embodiment of the present invention is a purified nucleic acid comprising the full coding sequences of a sequence selected from the group consisting of SEQ ID NOs: 3721-3811, wherein the full coding sequence comprises the sequence encoding the signal peptide and the sequence encoding the mature protein.

Still another embodiment of the present invention is a purified nucleic acid comprising a contiguous span of a sequence selected from the group consisting of SEQ ID NOs: 3721-3811 which encodes the mature protein.

[0052] Another embodiment of the present invention is a purified nucleic acid comprising a contiguous span of a sequence selected from the group consisting of SEQ ID NOs: [illegible] which encodes the signal peptide.

[0053] Another embodiment of the present invention is a purified nucleic acid encoding a polypeptide comprising a sequence selected from the group consisting of the sequences of SEQ ID NOs: [illegible].

[0054] Another embodiment of the present invention is a purified nucleic acid encoding a polypeptide comprising a sequence selected from the group consisting of the sequences of SEQ ID NOs: 7798-7888.
[0055] Another embodiment of the present invention is a purified nucleic acid encoding a polypeptide comprising a mature protein included in a sequence selected from the group consisting of the sequences of SEQ ID NOs: 7798-7888.

[0056] Another embodiment of the present invention is a purified nucleic acid encoding a polypeptide comprising a signal peptide included in a sequence selected from the group consisting of the sequences of SEQ ID NOs: 4101-4729 and 7798-7888.
[0057] Another embodiment of the present invention is a purified nucleic acid at least 15, 18, 20, 23, 25, 28, 30, 35, 40, 50, 75, 100, 200, 300, 500 or 1000 nucleotides in length which hybridizes under stringent conditions to a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.

[0058] Another embodiment of the present invention is a purified or isolated polypeptide comprising a sequence selected from the group consisting of the sequences of SEQ ID NOs: 4101-8177.

[0059] Another embodiment of the present invention is a purified or isolated polypeptide comprising a sequence selected from the group consisting of the sequences of SEQ ID NOs: 7798-7888.
[0060] Another embodiment of the present invention is a purified or isolated polypeptide comprising a mature protein of a polypeptide selected from the group consisting of SEQ ID NOs: [illegible].
[0061] Another embodiment of the present invention is a purified or isolated polypeptide comprising a signal peptide of a sequence selected from the group consisting of the polypeptides of SEQ ID NOs: 4101-4729 and 7798-7888.

[0062] Another embodiment of the present invention is a purified or isolated polypeptide comprising at least 10 consecutive amino acids of a sequence selected from the group consisting of the sequences of SEQ ID NOs: 4101-8177.

[0063] Another embodiment of the present invention is a method of making a cDNA comprising the steps of contacting a collection of mRNA molecules from human cells with a primer comprising at least 15 consecutive nucleotides of a sequence selected from the group consisting of the sequences complementary to SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, hybridizing said primer to an mRNA in said collection that encodes said protein, reverse transcribing said hybridized primer to make a first cDNA strand from said mRNA, making a second cDNA strand complementary to said first cDNA strand, and isolating the resulting cDNA encoding said protein comprising said first cDNA strand and said second cDNA strand.
[0064] Another embodiment of the present invention is a purified cDNA obtainable by the method of the preceding paragraph.

[0065] In one aspect of this embodiment, the cDNA encodes at least a portion of a human polypeptide.
[0066] Another embodiment of the present invention is a method of making a cDNA comprising the steps of obtaining a cDNA comprising a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, contacting said cDNA with a detectable probe comprising at least 15 consecutive nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and the sequences complementary to SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 under conditions which permit said probe to hybridize to said cDNA, identifying a cDNA which hybridizes to said detectable probe, and isolating said cDNA which hybridizes to said probe.

[0067] Another embodiment of the present invention is a purified cDNA obtainable by the method of the preceding paragraph.

[0068] In one aspect of this embodiment, the cDNA encodes at least a portion of a human polypeptide.
[0069] Another embodiment of the present invention is a method of making a cDNA comprising the steps of contacting a collection of mRNA molecules from human cells with a first primer capable of hybridizing to the polyA tail of said mRNA, hybridizing said first primer to said polyA tail, reverse transcribing said mRNA to make a first cDNA strand, making a second cDNA strand complementary to said first cDNA strand using at least one primer comprising at least 15 consecutive nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, and isolating the resulting cDNA comprising said first cDNA strand and said second cDNA strand.
[0070] Another embodiment of the present invention is a purified cDNA obtainable by the method of the preceding paragraph.

[0071] In one aspect of this embodiment, said cDNA encodes at least a portion of a human polypeptide.
[0072] In another aspect of the preceding method the second cDNA strand is made by contacting said first cDNA strand with a first pair of primers, said first pair of primers comprising a second primer comprising at least 15 consecutive nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and a third primer having a sequence therein which is included within the sequence of said first primer, performing a first polymerase chain reaction with said first pair of primers to generate a first PCR product, contacting said first PCR product with a second pair of primers, said second pair of primers comprising a fourth primer, said fourth primer comprising at least 15 consecutive nucleotides of said sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, and a fifth primer, wherein said fourth and fifth primers hybridize to sequences within said first PCR product, and performing a second polymerase chain reaction, thereby generating a second PCR product.
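
By way of illustration only, the two rounds of amplification described in the preceding paragraph may be sketched as a simple string-matching model. This is a minimal sketch, not the method of the invention: the function names, the example cDNA, and the primer sequences are all hypothetical, and the primers are shortened for readability although the text calls for at least 15 consecutive nucleotides. Exact matching stands in for hybridization.

```python
# Illustrative sketch only: all names below are hypothetical and the primers are
# shortened for readability (the text above calls for at least 15 nucleotides).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplify(template, forward, reverse):
    """Return the stretch of `template` bounded by the two primers, or None.

    `forward` matches the template strand itself; `reverse` anneals to it, so
    the reverse complement of `reverse` is what appears in the template.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


def nested_pcr(template, outer_pair, inner_pair):
    """Run two rounds; the inner pair must bind within the first product."""
    first_product = amplify(template, *outer_pair)
    if first_product is None:
        return None
    return amplify(first_product, *inner_pair)


if __name__ == "__main__":
    cdna = "TTGACCATGGCAAGTCTGAACGTTAGCCTAGGTT"  # made-up sequence
    print(nested_pcr(cdna, ("GACCA", "CCTAG"), ("GGCAA", "TAACG")))
```

In this model the second round returns nothing unless both inner primers find sites inside the first product, which mirrors the requirement that the fourth and fifth primers hybridize to sequences within the first PCR product.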

[0073] One aspect of this embodiment is a purified cDNA obtainable by the method of the preceding paragraph.

[0074] In another aspect of this embodiment, said cDNA encodes at least a portion of a human polypeptide.
[0075] Alternatively, the second cDNA strand may be made by contacting said first cDNA strand with a second primer comprising at least 15 consecutive nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, hybridizing said second primer to said first strand cDNA, and extending said hybridized second primer to generate said second cDNA strand.

[0076] One aspect of the above embodiment is a purified cDNA obtainable by the method of the preceding paragraphs.

[0077] In a further aspect of this embodiment, said cDNA encodes at least a portion of a human polypeptide.
[0078] Another embodiment of the present invention is a method of making a polypeptide comprising the steps of obtaining a cDNA which encodes a polypeptide encoded by a nucleic acid comprising a sequence selected from the group consisting of SEQ ID NOs: 24-4100 or a cDNA which encodes a polypeptide comprising at least 10 consecutive amino acids of a polypeptide encoded by a sequence selected from the group consisting of SEQ ID NOs: 24-4100, inserting said cDNA in an expression vector such that said cDNA is operably linked to a promoter, introducing said expression vector into a host cell whereby said host cell produces the protein encoded by said cDNA, and isolating said protein.

[0079] Another aspect of this embodiment is an isolated protein obtainable by the method of the preceding paragraph.

[0080] Another embodiment of the present invention is a method of obtaining a promoter DNA comprising the steps of obtaining genomic DNA located upstream of a nucleic acid comprising a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and the sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, screening said genomic DNA to identify a promoter capable of directing transcription initiation, and isolating said DNA comprising said identified promoter.
[0081] In one aspect of this embodiment, said obtaining step comprises walking from genomic DNA comprising a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and the sequences complementary to SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681. In another aspect of this embodiment, said screening step comprises inserting genomic DNA located upstream of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and the sequences complementary to SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 into a promoter reporter vector. For example, said screening step may comprise identifying motifs in genomic DNA located upstream of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and the sequences complementary to SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 which are transcription factor binding sites or transcription start sites.
[0082] Another embodiment of the present invention is an isolated promoter obtainable by the method of the paragraph above.

Another embodiment of the present invention is the inclusion of at least one sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, the sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and fragments comprising at least 15 consecutive nucleotides of said sequence in an array of discrete ESTs or fragments thereof of at least 15 nucleotides in length. In some aspects of this embodiment, the array includes at least two sequences selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, the sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, and fragments comprising at least 15 consecutive nucleotides of said sequences. In another aspect of this embodiment, the array includes at least five sequences selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681, the sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and fragments comprising at least 15 consecutive nucleotides of said sequences.
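
As a hedged illustration of what "fragments comprising at least 15 consecutive nucleotides" and "sequences complementary to" a given sequence amount to in practice, the following sketch enumerates such fragments for an arbitrary input. The function names and the default length are assumptions made for the example and do not come from the specification.

```python
# Illustrative sketch only; function names are hypothetical.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def consecutive_fragments(seq, length=15):
    """List every run of `length` consecutive nucleotides found in `seq`."""
    seq = seq.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def array_candidates(seq, length=15):
    """Fragments of the sequence and of its complement, duplicates removed."""
    both = consecutive_fragments(seq, length)
    both += consecutive_fragments(reverse_complement(seq), length)
    return sorted(set(both))
```

A sequence shorter than the requested length yields an empty list, which corresponds to the fact that no fragment of that length exists.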

[0083] Another embodiment of the present invention is an enriched population of recombinant nucleic acids, said recombinant nucleic acids comprising an insert nucleic acid and a backbone nucleic acid, wherein at least 5% of said insert nucleic acids in said population comprise a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and the sequences complementary to SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.
[0084] Another embodiment of the present invention is a purified or isolated antibody capable of specifically binding to a polypeptide comprising a sequence selected from the group consisting of SEQ ID NOs: 4101-8177.

A purified or isolated antibody capable of specifically binding to a polypeptide comprising at least 10 consecutive amino acids of a sequence selected from the group consisting of SEQ ID NOs: 4101-8177.

An antibody composition capable of selectively binding to an epitope-containing fragment of a polypeptide comprising a contiguous span of at least 8 amino acids of any of SEQ ID NOs: 4101-8177, wherein said antibody is polyclonal or monoclonal.
[0085] Another embodiment of the present invention is a computer readable medium having stored thereon a sequence selected from the group consisting of a nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 and a polypeptide code of SEQ ID NOs: 4101-8177.

[0086] Another embodiment of the present invention is a computer system comprising a processor and a data storage device wherein said data storage device has stored thereon a sequence selected from the group consisting of a nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 and a polypeptide code of SEQ ID NOs: 4101-8177. In one aspect of this embodiment the computer system further comprises a sequence comparer and a data storage device having reference sequences stored thereon. For example, the sequence comparer may comprise a computer program which indicates polymorphisms.

In another aspect of this embodiment, the computer system further comprises an identifier which identifies features in said sequence.

[0087] Another embodiment of the present invention is a method for comparing a first sequence to a reference sequence wherein said first sequence is selected from the group consisting of a nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 and a polypeptide code of SEQ ID NOs: 4101-8177 comprising the steps of reading said first sequence and said reference sequence through use of a computer program which compares sequences and determining differences between said first sequence and said reference sequence with said computer program. In some aspects of this embodiment, said step of determining differences between the first sequence and the reference sequence comprises identifying polymorphisms.
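
The comparison step can be pictured with a short sketch. It is offered only as an example of one possible sequence comparer, under the assumption that the two sequences have already been aligned and are of equal length; the function name and the output format are hypothetical and are not taken from the specification.

```python
# Illustrative sketch only: a position-by-position comparer for two sequences
# that are assumed to be pre-aligned and of equal length.
def find_differences(first, reference):
    """Return (1-based position, reference symbol, first-sequence symbol)
    for every position at which the two sequences differ."""
    if len(first) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(first.upper(), reference.upper())
    return [(i + 1, r, f) for i, (f, r) in enumerate(pairs) if f != r]


if __name__ == "__main__":
    # Made-up sequences; each reported tuple is a candidate polymorphism.
    for position, ref_base, new_base in find_differences("ACGTTA", "ACGATA"):
        print(position, ref_base, new_base)
```

A practical comparer would align the sequences first and would need a policy for ambiguous symbols such as "n"; both are outside the scope of this sketch.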

[0088] Another embodiment of the present invention is a method for identifying a feature in a sequence selected from the group consisting of a nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 and a polypeptide code of SEQ ID NOs: 4101-8177 comprising the steps of reading said sequence through the use of a computer program which identifies features in sequences and identifying features in said sequence with said computer program.
[0089] Another embodiment of the present invention is a vector comprising a nucleic acid according to any one of the nucleic acids described above.

[0090] Another embodiment of the present invention is a host cell containing the above vector.
[0091] Another embodiment of the present invention is a method of making any of the nucleic acids described above comprising the steps of introducing said nucleic acid into a host cell such that said nucleic acid is present in multiple copies in each host cell and isolating said nucleic acid from said host cell.

[0092] Another embodiment of the present invention is a method of making a nucleic acid of any of the nucleic acids described above comprising the step of sequentially linking together the nucleotides in said nucleic acids.
[0093] Another embodiment of the present invention is a method of making any of the polypeptides described above wherein said polypeptide is 150 amino acids in length or less comprising the step of sequentially linking together the amino acids in said polypeptide.

[0094] Another embodiment of the present invention is a method of making any of the polypeptides described above wherein said polypeptide is 120 amino acids in length or less comprising the step of sequentially linking together the amino acids in said polypeptide.

Brief Description of the Sequence Listing 

[0095] SEQ ID NOs: 1, 3, 5, 7, 9, 11, and 13 are full-length cDNAs prepared using the methods described herein.

[0096] SEQ ID NOs: 2, 4, 6, 8, 10, 12, and 14 are the polypeptides encoded by the nucleic acids of SEQ ID NOs: 1, 3, 5, 7, 9, 11, and 13.

[0097] SEQ ID NOs: 15, 16, 18, 19, 21 and 22 are primers whose use is described in the specification.

[0098] SEQ ID NOs: 17, 20, and 23 are the sequences of nucleic acids containing transcription factor binding sites which were obtained as described below.
[0099] SEQ ID NOs: 24-652 are nucleic acids having an incomplete ORF which encodes a signal peptide. As used herein, an "incomplete ORF" is an open reading frame in which a start codon has been identified but no stop codon has been identified. The locations of the incomplete ORFs and sequences encoding signal peptides are listed in the accompanying Sequence Listing. In addition, the von Heijne score of the signal peptide computed as described below is listed as the "score" in the accompanying Sequence Listing. The sequence of the signal-peptide is listed as "seq" in the accompanying Sequence Listing. The "/" in the signal peptide sequence indicates the location where proteolytic cleavage of the signal peptide occurs to generate a mature protein.
[0100] SEQ ID NOs: 653-3720 are nucleic acids having an incomplete ORF in which no sequence encoding a signal peptide has been identified to date. However, it remains possible that subsequent analysis will identify a sequence encoding a signal peptide in these nucleic acids. The locations of the incomplete ORFs are listed in the accompanying Sequence Listing.
[0101] SEQ ID NOs: 3721-3811 are nucleic acids having a complete ORF which encodes a signal peptide. As used herein, a "complete ORF" is an open reading frame in which a start codon and a stop codon have been identified. The locations of the complete ORFs and sequences encoding signal peptides are listed in the accompanying Sequence Listing. In addition, the von Heijne score of the signal peptide computed as described below is listed as the "score" in the accompanying Sequence Listing. The sequence of the signal-peptide is listed as "seq" in the accompanying Sequence Listing. The "/" in the signal peptide sequence indicates the location where proteolytic cleavage of the signal peptide occurs to generate a mature protein.
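
The distinction drawn in this listing between a complete ORF (start codon and stop codon both identified) and an incomplete ORF (start codon identified, no stop codon) can be illustrated with a deliberately simple sketch. It assumes the standard ATG start codon and the three standard stop codons, reads only the frame set by the first ATG, and uses hypothetical names throughout; it is not the computer analysis procedure of the specification.

```python
# Illustrative sketch only: classify the reading frame opened by the first ATG.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_orf(seq):
    """Return (status, start, end) where status is "complete", "incomplete"
    or "none"; positions are 0-based and `end` is exclusive."""
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return ("none", None, None)
    # Walk codon by codon, in frame with the start codon.
    for pos in range(start + 3, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return ("complete", start, pos + 3)
    return ("incomplete", start, None)
```

An EST that ends before any in-frame stop codon is reported as "incomplete", which is the situation of a 5' EST that covers only the beginning of a coding sequence.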
[0102] SEQ ID NOs: 3812-4100 are nucleic acids having a complete ORF in which no sequence encoding a signal peptide has been identified to date. However, it remains possible that subsequent analysis will identify a sequence encoding a signal peptide in these nucleic acids. The locations of the complete ORFs are listed in the accompanying Sequence Listing.
[0103] SEQ ID NOs: 4101-4729 are "incomplete polypeptide sequences" which include a signal peptide. "Incomplete polypeptide sequences" are polypeptide sequences encoded by nucleic acids in which a start codon has been identified, but no stop codon has been identified. These polypeptides are encoded by the nucleic acids of SEQ ID NOs: 24-652. The location of the signal peptide is listed in the accompanying Sequence Listing. In addition, the von Heijne score of the signal peptide computed as described below is listed as the "score" in the accompanying Sequence Listing. The sequence of the signal-peptide is listed as "seq" in the accompanying Sequence Listing. The "/" in the signal peptide sequence indicates the location where proteolytic cleavage of the signal peptide occurs to generate a mature protein.
[0104] SEQ ID NOs: 4730-7797 are incomplete polypeptide sequences in which no signal peptide has been identified to date. However, it remains possible that subsequent analysis will identify a signal peptide in these polypeptides. These polypeptides are encoded by the nucleic acids of SEQ ID NOs: 653-3720.
[0105] SEQ ID NOs: 7798-7888 are "complete polypeptide sequences" which include a signal peptide. "Complete polypeptide sequences" are polypeptide sequences encoded by nucleic acids in which a start codon and a stop codon have been identified. These polypeptides are encoded by the nucleic acids of SEQ ID NOs: 3721-3811. The location of the signal peptide is listed in the accompanying Sequence Listing. In addition, the von Heijne score of the signal peptide computed as described below is listed as the "score" in the accompanying Sequence Listing. The sequence of the signal-peptide is listed as "seq" in the accompanying Sequence Listing. The "/" in the signal peptide sequence indicates the location where proteolytic cleavage of the signal peptide occurs to generate a mature protein.
[0106] SEQ ID NOs: 7889-8177 are complete polypeptide sequences in which no signal peptide has been identified to date. However, it remains possible that subsequent analysis will identify a signal peptide in these polypeptides. These polypeptides are encoded by the nucleic acids of SEQ ID NOs: 3812-4100.
[0107] SEQ ID NOs: 8178-36681 are nucleic acid sequences in which no open reading frame has been conclusively identified to date. However, it remains possible that subsequent analysis will identify an open reading frame in these nucleic acids.

[0108] In the accompanying Sequence Listing, all instances of the symbol "n" in the nucleic acid sequences mean that the nucleotide can be adenine, guanine, cytosine or thymine. In some instances the polypeptide sequences in the Sequence Listing contain the symbol "Xaa." These "Xaa" symbols indicate either (1) a residue which cannot be identified because of nucleotide sequence ambiguity or (2) a stop codon in the determined sequence where applicants believe one should not exist (if the sequence were determined more accurately). In some instances, several possible identities of the unknown amino acids may be suggested by the genetic code.

Brief Description of the Drawings

[0109] Figure 1 summarizes the computer analysis procedure for obtaining [illegible].

[0110] Figure 2 [illegible] determine the frequency of [illegible] using the [illegible] described herein.

[0111] Figure 3 illustrates [illegible] cDNAs.

[0112] Figure 4 provides a schematic description of the promoter [illegible] way they are assembled with the [illegible].

[0113] Figure 5 describes the transcription factor binding sites present in each of these promoters.

Detailed Description of the Preferred Embodiment

I. General Methods for Obtaining 5' ESTs derived from mRNAs with intact 5' ends

[0114] In order to obtain the 5' ESTs of the present invention, mRNAs with intact 5' ends must be obtained. Example 1 below describes the preparation of 5' ESTs.

EXAMPLE 1

Preparation of mRNA

[0115] Total human RNAs or polyA+ RNAs derived from 30 different tissues were respectively purchased from [illegible] and used to generate 42 cDNA libraries as described below. The purchased RNA had been isolated from cells or tissues using acid guanidium thiocyanate-phenol-chloroform extraction (Chomczynski and Sacchi, Analytical Biochemistry 162:156-159, 1987). PolyA+ RNA was isolated from total RNA ([illegible]) by two passes of oligo dT chromatography, as described by Aviv and Leder, Proc. Natl. Acad. Sci. USA 69:1408-1412, 1972, in order to eliminate ribosomal RNA.

[0116] The quality and the integrity of the polyA+ RNAs were checked. Northern blots hybridized with a globin probe were used to confirm that the mRNAs were not degraded. Contamination of the polyA+ mRNAs by ribosomal sequences was checked using Northern blots and a probe derived from the sequence of the 28S rRNA. Preparations of mRNAs with less than 5% of rRNAs were used in library construction. To avoid constructing libraries with RNAs contaminated by exogenous sequences (prokaryotic or fungal), the presence of bacterial 16S ribosomal sequences or of two highly expressed fungal mRNAs was examined using PCR.

[0117] Following preparation of the mRNAs from various tissues, an oligonucleotide tag was specifically attached to the 5' ends of the mRNAs. The oligonucleotide tag had an EcoRI site therein to facilitate later cloning procedures. Following attachment of the oligonucleotide tag to the mRNA, the integrity of the mRNA was examined by performing a Northern blot with 200 to 500 ng of mRNA using a probe complementary to the oligonucleotide tag before performing the first strand synthesis described in Example 2.

EXAMPLE 2

cDNA Synthesis Using mRNA Templates Having Intact 5' Ends

[0118] For the mRNAs joined to oligonucleotide tags, first strand cDNA synthesis was performed using reverse transcriptase [illegible]. In order to protect internal EcoRI sites in the cDNA from digestion at later steps in the procedure, methylated dCTP was used for first strand synthesis. After removal of RNA by an alkaline hydrolysis, the first strand of cDNA was precipitated using isopropanol in order to eliminate residual primers.

[0119] The second strand of the cDNA was synthesized with a Klenow fragment using a primer corresponding to the 5' end of the ligated oligonucleotide. Methylated dCTP was also used for second strand synthesis in order to protect internal EcoRI sites in the cDNA from digestion.

[0120] Following cDNA synthesis, the cDNAs were cloned into pBlueScript as described in Example 3 below.

EXAMPLE 3

Cloning of cDNAs derived from mRNA with intact 5' ends into BlueScript

[0121] Following second strand synthesis, the ends of the cDNA were blunted with T4 DNA polymerase (Biolabs) and the cDNA was digested with EcoRI. Since methylated dCTP was used during cDNA synthesis, the EcoRI site present in the tag was the only hemi-methylated site, hence the only site susceptible to EcoRI digestion. The cDNA was then size fractionated using exclusion chromatography (AcA, Biosepra) and fractions corresponding to cDNAs of more than 150 bp were pooled and ethanol precipitated. The cDNA was directionally cloned into the SmaI and EcoRI ends of the phagemid pBlueScript vector (Stratagene). The ligation mixture was electroporated into bacteria and propagated under appropriate antibiotic selection.

[0122] Clones containing the oligonucleotide tag attached were then selected as described in Example 4 below.

EXAMPLE 4 

Selection of Clones Having the Oligonucleotide Tag Attached Thereto

[0123] The plasmid DNAs containing 5' EST libraries made as described above were purified (Qiagen). A positive selection of the tagged clones was performed as follows. Briefly, in this selection procedure, the plasmid DNA was converted to single stranded DNA using gene II endonuclease of the phage F1 in combination with an exonuclease (Chang et al., Gene 127:95-8, 1993) such as exonuclease III or T7 gene 6 exonuclease. The resulting single stranded DNA was then purified using paramagnetic beads as described by Fry et al., Biotechniques, 13: 124-131, 1992. In this procedure, the single stranded DNA was hybridized with a biotinylated oligonucleotide having a sequence corresponding to the 3' end of the oligonucleotide tag. Clones including a sequence complementary to the biotinylated oligonucleotide were captured by incubation with streptavidin coated magnetic beads followed by magnetic selection. After capture of the positive clones, the plasmid DNA was released from the magnetic beads and converted into double stranded DNA using a DNA polymerase such as the ThermoSequenase obtained from Amersham Pharmacia Biotech. The double stranded DNA was then electroporated into bacteria. The percentage of positive clones having the 5' tag oligonucleotide was estimated to typically rank between 90 and 98% using dot blot analysis.

[0124] Following electroporation, the libraries were ordered in 384-microtiter plates (MTP). A copy of the MTP was stored for future needs. Then the libraries were transferred into 96 MTP and sequenced as described below.

EXAMPLE 5

Sequencing of Inserts in Selected Clones 

[0125] Plasmid inserts were first amplified by PCR on PE-9600 thermocyclers (Perkin-Elmer, Applied Biosystems Division, Foster City, CA), using standard SETA-A and SETA-B primers (Genset SA), AmpliTaqGold (Perkin-Elmer), dNTPs (Boehringer), buffer and cycling conditions as recommended by the Perkin-Elmer Corporation.

[0126] PCR products were then sequenced using automatic ABI Prism 377 sequencers (Perkin Elmer). Sequencing reactions were performed using PE 9600 thermocyclers with standard dye-primer chemistry and ThermoSequenase (Amersham Pharmacia Biotech). The primers used were either T7 or 21M13 (available from Genset SA) as appropriate. The primers were labeled with the JOE, FAM, ROX and TAMRA dyes. The dNTPs and ddNTPs used in the sequencing reactions were purchased from Boehringer. Sequencing buffer, reagent concentrations and cycling conditions were as recommended by Amersham.

[0127] Following the sequencing reaction, the samples were precipitated with ethanol, resuspended in formamide loading buffer, and loaded on a standard 4% acrylamide gel. Electrophoresis was performed for 2.5 hours at 3000V on an ABI 377 sequencer, and the sequence data were collected and analyzed using the ABI Prism DNA Sequencing Analysis Software, version 2.1.2.

EXAMPLE 6

Obtaining 5' ESTs from Full-length cDNA Libraries Obtained from mRNA with Intact 5' Ends

[0128] Alternatively, 5'ESTs may be isolated from other cDNA or genomic DNA libraries. Such cDNA or genomic DNA libraries may be obtained from a commercial source or made using other techniques familiar to those skilled in the art. One example of such cDNA library construction, a full-length cDNA library, is as follows.
[0129] PolyA+ RNAs are prepared and their quality checked as described in Example 1. Then, the caps at the 5' ends of the polyA+ RNAs are specifically joined to an oligonucleotide tag. The oligonucleotide tag may contain a restriction site such as Eco RI to facilitate further subcloning procedures. Northern blotting is then performed to check the size of mRNAs having the oligonucleotide tag attached thereto and to ensure that the mRNAs were actually tagged.
[0130] First strand synthesis is subsequently carried out for mRNAs joined to the oligonucleotide tag as described in Example 2 above except that the random nonamers are replaced by an oligo-dT primer. For instance, this oligo-dT primer may contain an internal tag of 4 nucleotides which is different from one tissue to the other. Following second strand synthesis using a primer contained in the oligonucleotide tag attached to the 5' end of mRNA, the blunt ends of the obtained double stranded full-length DNAs are modified into cohesive ends to facilitate subcloning. For example, the extremities of full-length cDNAs may be modified to allow subcloning into the Eco RI and Hind III sites of a Bluescript vector using the Eco RI site of the oligonucleotide tag and the addition of a Hind III adaptor to the 3' end of full-length cDNAs.

[0131] The full-length cDNAs are then separated into several fractions according to their sizes using techniques familiar to those skilled in the art. For example, electrophoretic separation may be applied in order to yield 3 or 6 different fractions. Following gel extraction and purification, the cDNA fractions are subcloned into appropriate vectors, such as Bluescript vectors, transformed into competent bacteria and propagated under appropriate antibiotic conditions. Subsequently, plasmids containing tagged full-length cDNAs are positively selected as described in Example 4.

[0132] The 5' end of full-length cDNAs isolated from such cDNA libraries may then be sequenced as described in Example 5.

II.2. Computer Analysis of the Isolated 5' ESTs: Construction of the NetGene™ and SignalTag™ databases

[0133] The sequence data from the 42 cDNA libraries made as described above were transferred to a database, where quality control and validation steps were performed. A base-caller, working using a Unix system, automatically flagged suspect peaks, taking into account the shape of the peaks, the inter-peak resolution, and the noise level. The proprietary base-caller also performed an automatic trimming. Any stretch of 25 or fewer bases having more than 4 suspect peaks was considered unreliable and was discarded. Sequences corresponding to cloning vector or ligation oligonucleotides were automatically removed from the EST sequences. However, the resulting EST sequences may contain 1 to 5 bases belonging to the above mentioned sequences at their 5' end. If needed, these can easily be removed on a case to case basis.
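
The trimming rule stated above can be illustrated with a short sketch. This is not the proprietary base-caller; it assumes the suspect-peak flags are already available as a list of 0/1 values, approximates "any stretch of 25 or fewer bases" with a sliding 25-base window, and keeps the longest remaining clean stretch. The function names and the choice to keep the longest stretch are assumptions made for illustration only.

```python
def find_unreliable_windows(suspect, window=25, max_suspect=4):
    """Return half-open (start, end) spans of windows holding more than
    max_suspect flagged peaks. `suspect` is a list of 0/1 flags per base."""
    flagged = []
    for start in range(0, max(1, len(suspect) - window + 1)):
        chunk = suspect[start:start + window]
        if sum(chunk) > max_suspect:
            flagged.append((start, start + len(chunk)))
    return flagged


def trim_read(bases, suspect, window=25, max_suspect=4):
    """Discard every base covered by an unreliable window and return the
    longest remaining clean stretch (an assumed policy, not the patent's)."""
    bad = [False] * len(bases)
    for start, end in find_unreliable_windows(suspect, window, max_suspect):
        for i in range(start, end):
            bad[i] = True
    best = (0, 0)
    run_start = None
    # A trailing sentinel closes the last clean run.
    for i, is_bad in enumerate(bad + [True]):
        if not is_bad and run_start is None:
            run_start = i
        elif is_bad and run_start is not None:
            if i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    return bases[best[0]:best[1]]
```

A read with exactly 4 suspect peaks in a window is left untouched, since the rule requires more than 4.
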
[0134] Following sequencing as described above, the sequences of the 5' ESTs were entered in NetGene™, a database for storage and manipulation as described below and as depicted in Figure 1. Before searching the ESTs in the NetGene™ database for sequences of interest, ESTs derived from mRNAs which were not of interest, such as endogenous or exogenous contaminants, redundant sequences, small sequences, highly degenerate sequences, or repeated sequences, were identified and eliminated from further consideration.
[0135] In order to determine the accuracy of the sequencing procedure as well as the efficiency of the 5' selection described above, the analyses described in Examples 7 and 8 respectively were performed on 5'ESTs obtained following the elimination of sequences which were not of interest.

EXAMPLE 7

Measurement of Sequencing Accuracy by Comparison to Known Sequences

[0136] To further determine the accuracy of the sequencing procedures described in Example 5, the sequences of NetGene™ 5' ESTs derived from known sequences were identified and compared to the original known sequences. First, a FASTA analysis with overhangs shorter than 5 bp on both ends was conducted on the 5' ESTs to identify those matching an entry in the public human mRNA database. The 6655 5' ESTs which matched a known human mRNA were then realigned with their cognate mRNA and dynamic programming was used to include substitutions, insertions, and deletions in the list of "errors" which would be recognized. Errors occurring in the last 10 bases of the 5' EST sequences were ignored to avoid the inclusion of spurious cloning sites in the analysis of sequencing accuracy.
[0137] This analysis revealed that the sequences incorporated in the NetGene™ database had an accuracy of more than 99.5%.
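
As an illustration of how such an accuracy figure can be computed, the sketch below counts substitutions, insertions, and deletions with a standard dynamic programming edit distance and ignores the last 10 bases of the read. It is a simplified stand-in for the FASTA and realignment steps described above: it assumes the reference has already been cut to the region covered by the EST, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of substitutions, insertions,
    and deletions turning a into b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion from a
                           cur[j - 1] + 1,             # insertion into a
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def sequencing_accuracy(est, reference, ignore_tail=10):
    """Fraction of correct bases, ignoring the last `ignore_tail` bases.
    Assumes `reference` spans the same region as `est`."""
    est_core = est[:-ignore_tail] if ignore_tail else est
    ref_core = reference[:-ignore_tail] if ignore_tail else reference
    errors = edit_distance(est_core, ref_core)
    return 1.0 - errors / max(len(ref_core), 1)
```
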

EXAMPLE 8

Determination of Efficiency of 5' EST Selection

[0138] To determine the efficiency at which the above selection procedures isolated 5' ESTs which included sequences close to the 5' end of the mRNAs from which they derived, the sequences of the ends of the 5' ESTs derived from the [...] and [...] heavy chain genes were compared to the known sequences of these genes. Since the transcription start sites of both genes are [...], they may be used to determine the percentage of derived 5' ESTs which included the authentic transcription start sites.
[0139] For both genes, more than [...] of the 5' ESTs actually included sequences close to or upstream of the 5' end of the corresponding mRNAs.
[0140] To extend the analysis of the reliability of the procedures for isolating 5' ESTs from ESTs in the NetGene™ database, a similar analysis was conducted using a database composed of human mRNA sequences extracted from GenBank database release 97 for comparison. The 5' ends of more than 85% of 5' ESTs derived from mRNAs included in the GenBank database were located close to the 5' end of the known sequence. As some of the mRNA sequences available in the GenBank database are deduced from genomic sequences, a 5' end matching with these sequences will be counted as an internal match. Thus, the method used here underestimates the yield of ESTs including the authentic 5' ends of their corresponding mRNAs.

EXAMPLE 9 

Clustering of the 5' ESTs

[0141] Since the cDNA libraries include multiple 5' ESTs derived from the same mRNA, overlapping 5'ESTs may be assembled into continuous sequences. The following method (see Figure 1) describes how to efficiently cluster 5'ESTs in order to yield not only consensus 5'EST sequences for mRNAs derived from different genes but also consensus 5'EST sequences for different mRNAs, so called variants, transcribed from the same gene, such as alternatively spliced mRNAs. This clustering was performed on a set of NetGene™ 5'ESTs sequences following elimination of endogenous contaminants, elimination of uninformative sequences and masking of repeats.

[0142] The whole set of sequences was first partitioned into smaller sets, so-called clusters, containing sequences exhibiting perfect matches with each other on a given length. Such clusters contain 5'ESTs derived from a small number of different genes. Some sequences were not clustered using this approach either because they were not homologous to any other sequence or because the homology was not properly detected. To overcome this problem, sequences not clustered, so called singletons, may be compared to the consensus contigated ESTs obtained later on and, if necessary, included in the appropriate clusters and used to compute other consensus contigated ESTs.
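
The initial partitioning step can be sketched as follows. The sketch assumes that a "perfect match on a given length" means two sequences share at least one identical substring of length k, links such sequences with a union-find structure, and reports unlinked sequences as singletons. It ignores the reverse strand, and the value of k and the function name are illustrative assumptions, not values taken from this document.

```python
def cluster_by_shared_kmer(seqs, k=20):
    """Group sequences that share at least one exact k-mer.
    Returns (clusters, singletons) as lists of sequence indices."""
    parent = list(range(len(seqs)))

    def find(x):
        # Path-halving find for the union-find forest.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_owner = {}  # k-mer -> index of the first sequence containing it
    for idx, seq in enumerate(seqs):
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in first_owner:
                ra, rb = find(idx), find(first_owner[kmer])
                if ra != rb:
                    parent[ra] = rb
            else:
                first_owner[kmer] = idx

    groups = {}
    for idx in range(len(seqs)):
        groups.setdefault(find(idx), []).append(idx)
    clusters = sorted(g for g in groups.values() if len(g) > 1)
    singletons = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singletons
```

Because membership is transitive, a cluster built this way may mix a few different genes, which is consistent with the statement that clusters contain 5'ESTs from a small number of genes and need further separation.
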
[0143] Genes and their variants were identified in each cluster as follows. Overlapping sequences inside a given cluster were figured as oriented graphs where each sequence was a node and each overlap an edge. Then, the different genes contained within a single graph, which were represented by different connex components, were identified and isolated from each other. Subsequently, the different variants of a same gene were isolated using an algorithm based on the detection of forks within a connex component. If desired, the consensus contigated EST sequences may be verified by identifying clones in nucleic acid samples derived from biological tissues, such as cDNA libraries, which hybridize to the probes based on the sequences of the consensus contigated ESTs and sequencing them.
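
A minimal sketch of the graph step is given below. It assumes the overlaps are supplied as directed edges (upstream sequence, downstream sequence) that have already been reduced so that each sequence points only to its immediate neighbours; under that assumption a fork is a node with more than one successor or more than one predecessor. The actual fork-detection algorithm is not described in this document, so both functions are illustrative only.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Weakly connected components of the overlap graph
    (one component per candidate gene)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for node in nodes:
        if node in seen:
            continue
        stack = [node]
        seen.add(node)
        component = []
        while stack:
            current = stack.pop()
            component.append(current)
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


def fork_nodes(edges):
    """Nodes where the oriented graph branches, taken here as a hint
    that several variants of the same gene are present."""
    out_degree = defaultdict(int)
    in_degree = defaultdict(int)
    for a, b in edges:
        out_degree[a] += 1
        in_degree[b] += 1
    branching = {n for n, d in out_degree.items() if d > 1}
    merging = {n for n, d in in_degree.items() if d > 1}
    return sorted(branching | merging)
```
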

[0144] Overlapping 5'EST sequences belonging to the same variant as well as included 5'EST sequences belonging to the same cluster were then contigated and consensus contigated 5'EST sequences were generated for each variant. Some of the obtained consensus contigated 5'EST sequences were incomplete due to the fact that only included and overlapping sequences were considered to isolate genes and due to the algorithm developed to find variants. Such consensus contigated sequences were completed as follows. Variants [...] were compared pairwise and the 5' EST consensus sequences that were incomplete either in 5' and/or in 3' were extended with the appropriate sequence from the other variants. All 5' EST consensus sequences eventually completed in 5' or 3' from each cluster were subsequently compared to the whole set of individual 5'EST sequences obtained for this cluster.

EXAMPLE 10

Identification of the Most Probable Open Reading Frames of 5' ESTs

[0145] Subsequently, the most probable coding open reading frame (ORF) may be determined for each consensus assembled 5'EST or 5'EST as follows.
[0146] Each nucleic acid sequence is first divided into several subsequences whose coding propensity is evaluated using different methods known to those skilled in the art such as the evaluation of N-mer frequency and its variants (Fickett and Tung, Nucleic Acids Res. 20:6441-50 (1992)) or the Average Mutual Information method (Grosse et al., International Conference on Intelligent Systems for Molecular Biology, Montreal, Canada, June 28-July 1, 1998). Each of the scores obtained by the techniques described above are then normalized by their distribution extremities and then fused using a neural network into a unique score that represents the coding probability of a given subsequence.
[0147] The coding probability scores obtained for each subsequence, thus the probability scores obtained for each reading frame, are then linked to the initiation codons present on the sequence. For each open reading frame, defined as a nucleic acid sequence of at least 50 nucleotides beginning with an ATG codon, an ORF score is determined. Basically, this score is the sum of the probability scores computed for each subsequence corresponding to the ORF in the correct reading frame, corrected by a function that negatively weights locally high score values and positively weights sustained high score values. The chosen ORF is the one with the highest score.
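
The ORF definition above (at least 50 nucleotides, beginning with an ATG codon) can be sketched as a simple scan of the three forward reading frames. The coding-probability scoring and its correction function are not specified here, so the `score` argument below is a hypothetical placeholder supplied by the caller. Extending an ORF to the end of the sequence when no stop codon is found, and skipping ATG codons nested inside an ORF already reported, are assumptions of this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=50):
    """Return (start, end, complete) tuples for ORFs in the three forward
    frames. `end` is exclusive; `complete` is True when a stop codon was found."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            j = i
            complete = False
            while j + 3 <= len(seq):
                if seq[j:j + 3] in STOP_CODONS:
                    complete = True
                    break
                j += 3
            end = j + 3 if complete else j
            if end - i >= min_len:
                orfs.append((i, end, complete))
            i = end  # resume scanning after this ORF
    return orfs


def best_orf(seq, score, min_len=50):
    """Pick the ORF whose nucleotide sequence maximizes `score`,
    a caller-supplied (hypothetical) coding-probability function."""
    orfs = find_orfs(seq, min_len)
    if not orfs:
        return None
    return max(orfs, key=lambda orf: score(seq[orf[0]:orf[1]]))
```

Passing `score=len` simply selects the longest ORF, which can serve as a baseline before a real coding-propensity score is plugged in.
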
[0148] Two kinds of ORFs are considered. In some embodiments, 5'ESTs encoding ORFs of at least 50 amino acids extending up to the end of the consensus assembled 5'EST sequences are obtained. In other embodiments, 5'ESTs encoding complete ORFs, namely ORFs with start and stop codons, containing at least 100 amino acids are obtained.

EXAMPLE 11

Sequence Analysis

[0149] Application of the clustering method described in Example 9 to a selected set of 126,735 NetGene™ 5'ESTs free from endogenous contaminants and uninformative sequences yielded [...] consensus assembled 5'EST sequences [...] variants, for a total of 8037 genes clustered representing 98,973 individual 5'ESTs. One of them, which contained 21,138 sequences and was shown to contain chimeras thanks to comparison to public sequences, was removed from further analysis.
[0150] Both nonclustered 5'ESTs, i.e. singletons, and consensus contigated 5'ESTs were then compared to already known sequences as follows. Those sequences matching human mRNA sequences were eliminated from further analysis. Then, following masking of repeats, those sequences matching sequences that have already been discovered by the inventors, namely sequences exhibiting more than 90% homology over stretches longer than 40 nucleotides using BLAST2N with overhangs shorter than 10 nucleotides, were removed from further consideration. The final set represents the sequences of the invention (SEQ ID NOs: 24-4100 and 8178-36681), i.e., 7609 consensus contigated 5'ESTs from 6398 clusters containing 31,267 5'ESTs and 24,972 singletons.

[0151] Of the 6398 obtained clusters, 658 were shown to be multivariant, i.e. to contain several variants of the same gene. Table I gives for each of the multivariant clusters named by its internal reference (first column), the list of the consensus sequences of all variants, each variant being represented by a different SEQ ID NO.
[0152] Subsequently, the most probable open reading frame was determined, as described in Example 10, for all sequences of the invention. 3,697 5'ESTs (SEQ ID NOs: 24-3720) encoding ORFs (SEQ ID NOs: 4101-7797) of at least [...] amino acids long were found. In addition, 380 5'ESTs (SEQ ID NOs: 3721-4100) encoding complete ORFs (SEQ ID NOs: 7798-8177) of at least 100 amino acids were found.
[0153] The nucleotide sequences of the SEQ ID NOs: 24-4100 and 8178-36681 and the amino acid sequences encoded by SEQ ID NOs: 24-4100 (i.e. amino acid sequences of SEQ ID NOs: 4101-8177) are provided in the appended sequence listing. Some of the amino acid sequences may contain "Xaa" designators. These "Xaa" designators indicate either (1) a residue which cannot be identified because of nucleotide sequence ambiguity or (2) a stop codon in the determined sequence where applicants believe one should not exist (if the sequence were determined more accurately).
[0154] If one of the nucleic acid sequences of SEQ ID NOs: 24-4100 and 8178-36681 is suspected of containing one or more incorrect or ambiguous nucleotides, the ambiguities can readily be resolved by resequencing a fragment containing the nucleotides to be evaluated. If one or more incorrect or ambiguous nucleotides are detected, the corrected sequences should be included in the clusters from which the sequences were isolated, and used to compute other consensus contigated sequences on which other ORFs would be identified. Nucleic acid fragments for resolving sequencing errors or ambiguities may be obtained from deposited clones or can be isolated using the techniques described herein. Resolution of any such ambiguities or errors may be facilitated by using primers which hybridize to sequences located close to the ambiguous or erroneous sequences. For example, the primers may hybridize to sequences within 50-75 bases of the ambiguity or error. Upon resolution of an error or ambiguity, the corresponding corrections can be made in the protein sequences encoded by the DNA containing the error or ambiguity. The amino acid sequence of the protein encoded by a particular clone can also be determined by expression of the clone in a suitable host cell, collecting the protein, and determining its sequence.
[0155] In addition, if one of the sequences of SEQ ID NOs: 4101-8177 is suspected of containing a truncated ORF as the result of a frameshift in the sequence, such frameshifting errors may be corrected by combining the following two approaches. The first one involves thorough examination of all double predictions, i.e. all cases where the probability scores for two ORFs located on different reading frames are high and close, preferably different by less than 0.4. The fine examination of the region where the two possible ORFs overlap may help to detect the frameshift. In the second approach, homologies with known proteins are used to correct suspected frameshifts.

EXAMPLE 12

Identification of Potential Signal Sequences in 5' ESTs

[0156] The amino acid sequences of SEQ ID NOs 4101-8177 were then searched for signal motifs using slight modifications of the procedures disclosed in Von Heijne, Nucleic Acids Res. 14:4683-4690, 1986. Those sequences encoding a 15 amino acid long stretch with a score of at least [...] in the Von Heijne signal peptide identification matrix were considered to possess a signal sequence and were included in a database [...].
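
The window scoring described above can be sketched as a position-specific weight matrix applied to every 15-residue stretch of a protein. The weights used in the patent are not reproduced in this document, so `weights` below is a hypothetical list of 15 dictionaries mapping amino acids to scores, and the 3.5 default threshold is borrowed from the cutoff used in Example 13 rather than from this paragraph.

```python
def best_window_score(protein, weights, window=15):
    """Slide a `window`-residue frame along `protein` and return
    (best_score, start_index), or None if the protein is too short.
    `weights[pos]` maps an amino acid letter to its score at `pos`."""
    best = None
    for start in range(len(protein) - window + 1):
        stretch = protein[start:start + window]
        score = sum(weights[pos].get(aa, 0.0) for pos, aa in enumerate(stretch))
        if best is None or score > best[0]:
            best = (score, start)
    return best


def has_signal_sequence(protein, weights, threshold=3.5, window=15):
    """True when at least one window reaches the (assumed) threshold."""
    best = best_window_score(protein, weights, window)
    return best is not None and best[0] >= threshold
```

With a toy matrix that rewards leucine at every position, a leucine-rich N-terminal stretch scores highest, which mimics the hydrophobic core of a signal peptide without claiming to reproduce the published matrix.
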

[0157] The sequences of the [...] nucleic acid sequences containing a signal sequence (SEQ ID NOs: 24-652 and 3721-3811) and the corresponding polypeptides with a potential signal peptide (SEQ ID NOs: 4101-4729 and 7798-7888) are provided in the Sequence Listing appended hereto. The signal peptides of such polypeptides are indicated as features in the appended Sequence Listing. It should be noted that, in accordance with the regulations governing Sequence Listings, in the appended Sequence Listing, the full protein (i.e., the protein containing the signal peptide and the mature protein) extends from an amino acid residue having a negative number through a positively numbered C-terminal amino acid residue. Thus, the first amino acid of the mature protein resulting from cleavage of the signal peptide is designated as amino acid number 1 and the first amino acid of the signal peptide is designated with the appropriate negative number.
[0158] To confirm the accuracy of the above method for identifying signal sequences, the analysis of Example 13 was performed.



EXAMPLE 13

Confirmation of Accuracy of Identification of Potential Signal Sequences in 5' ESTs

[0159] The accuracy of the above procedure for identifying signal sequences encoding signal peptides was evaluated by applying the method to the 43 amino acids located at the N terminus of all human SwissProt proteins. The computed Von Heijne score for each protein was compared with the known characterization of the protein as being a secreted protein or a non-secreted protein. In this manner, the number of non-secreted proteins having a score higher than 3.5 (false positives) and the number of secreted proteins having a score lower than 3.5 (false negatives) could be calculated.
[0160] Using the results of the above analysis, the probability that a peptide encoded by the 5' region of the mRNA is in fact a genuine signal peptide based on its Von Heijne's score was calculated based on either the assumption that 10% of human proteins are secreted or the assumption that 20% of human proteins are secreted. The results of this analysis are shown in Figure 2.
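
One way to read this calculation is as an application of Bayes' rule: the false positive and false negative rates measured on SwissProt are combined with an assumed fraction of secreted proteins. The sketch below uses a single score threshold rather than the per-score curve of Figure 2, and the rates in the usage note are invented for illustration; they are not results from this document.

```python
def posterior_signal_peptide(false_pos_rate, false_neg_rate, prior_secreted):
    """P(genuine signal peptide | score above threshold) by Bayes' rule.

    false_pos_rate: fraction of non-secreted proteins scoring above threshold
    false_neg_rate: fraction of secreted proteins scoring below threshold
    prior_secreted: assumed fraction of secreted proteins (e.g. 0.10 or 0.20)
    """
    sensitivity = 1.0 - false_neg_rate
    true_positive_mass = sensitivity * prior_secreted
    false_positive_mass = false_pos_rate * (1.0 - prior_secreted)
    return true_positive_mass / (true_positive_mass + false_positive_mass)
```

For example, with hypothetical rates of 10% false positives and 10% false negatives, the posterior is 0.5 under the 10% prior and rises to about 0.69 under the 20% prior, which shows why both assumptions are reported.
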
[0161] Using the above method of identification of secretory proteins, 5' ESTs of the following polypeptides known to be secreted were obtained: human glucagon, gamma interferon induced monokine precursor, secreted cyclophilin-like protein, human pleiotropin, and human biotinidase precursor. Thus, the above method successfully identified those 5' ESTs which encode a signal peptide.

[0162] To confirm that the signal peptide encoded by the 5' ESTs or contigated consensus 5' ESTs actually functions as a signal peptide, the signal sequences from the 5' ESTs or consensus 5' ESTs may be cloned into a vector designed for the identification of signal peptides. Such vectors are designed to confer the ability to grow in selective medium only to host cells containing a vector with an operably linked signal sequence. For example, to confirm that a 5' EST or consensus 5' EST encodes a genuine signal peptide, the signal sequence of the 5' EST or consensus 5' EST may be inserted upstream and in frame with a non-secreted form of the yeast invertase gene in signal peptide selection vectors such as those described in [...]. Growth of host cells containing signal sequence selection vectors with the correctly inserted 5' EST or consensus 5' EST signal sequence confirms that the 5' EST or consensus 5' EST encodes a genuine signal peptide.
[0163] Alternatively, the presence of a signal peptide may be confirmed by cloning the extended cDNAs obtained using the ESTs or consensus 5' ESTs into expression vectors such as pXT1 as described below, or by constructing promoter-signal sequence-reporter gene vectors which encode fusion proteins between the signal peptide and an assayable reporter protein. After introduction of these vectors into a suitable host cell, such as COS cells or NIH 3T3 cells, the growth medium may be harvested and analyzed for the presence of the secreted protein. The medium from these cells is compared to the medium from control cells containing vectors lacking the signal sequence or extended cDNA insert to identify vectors which encode a functional signal peptide or an authentic secreted protein.

EXAMPLE 14

Assessment of the novelty rate of 5'ESTs

[0164] To assess the yield of new sequences, the obtained 5'ESTs and consensus contigated 5'ESTs were compared to all known human mRNAs extracted from the EMBL release 57 and daily updates available at the time of filing. The comparison was performed using BLAST2N on both strands following masking of the repeats. Sequences having more than 95% homology with public sequences over their whole length with at most 10 nucleotide overhangs on each extremity were considered as previously identified. Thus, about 90% of 5'ESTs or consensus assembled 5'ESTs were considered unidentified.
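
The classification rule above can be expressed as a small filter over alignment hits. The sketch assumes each hit is summarized by the 1-based start and end of the aligned region on the query and a percent identity; these fields and the function names are illustrative assumptions, not the output format of any particular BLAST version.

```python
def is_previously_identified(query_len, q_start, q_end, percent_identity,
                             min_identity=95.0, max_overhang=10):
    """Apply the rule: more than 95% identity over the whole query,
    allowing at most 10 unaligned nucleotides at each extremity.
    q_start and q_end are 1-based inclusive query coordinates."""
    left_overhang = q_start - 1
    right_overhang = query_len - q_end
    return (percent_identity > min_identity
            and left_overhang <= max_overhang
            and right_overhang <= max_overhang)


def novelty_rate(queries):
    """`queries` is a list of (query_len, hits), each hit being
    (q_start, q_end, percent_identity). Returns the fraction of
    queries with no qualifying hit."""
    if not queries:
        return 0.0
    novel = 0
    for query_len, hits in queries:
        known = any(is_previously_identified(query_len, s, e, ident)
                    for s, e, ident in hits)
        if not known:
            novel += 1
    return novel / len(queries)
```
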

II.3. Evaluation of Spatial and Temporal Expression of mRNAs Corresponding to the 5'ESTs or Extended cDNAs

[0165] Each of the SEQ ID NOs: 24-4100 and 8178-36681 was also categorized based on the tissue from which its corresponding mRNA was obtained, as described below in Example 15.

EXAMPLE 15

Expression Patterns of mRNAs From Which the 5'ESTs were obtained

[0166] Table II shows the spatial distribution of each of the 5'ESTs (non-clustered ESTs) and of each consensus contigated ESTs respectively. Table II provides the SEQ ID NOs of the 5' ESTs (referred to alternatively herein as non-clustered ESTs or singletons) and consensus contigated ESTs. Table II also lists the number of ESTs from each type of tissue which were used to assemble the contigated consensus ESTs. The SEQ ID NOs in Table II which contain a single 5' EST from a single tissue are 5' ESTs. Each type of tissue listed in Table II is encoded by a letter. The correspondence between the letter code and the tissue type is given in Table III. For example, the consensus contigated EST of SEQ ID NO: 47 contains one 5'EST from cancerous prostate, two 5'ESTs from lymph ganglia, and two 5'ESTs from testes.

[0167] In addition to categorizing the 5' ESTs and consensus contigated 5' ESTs with respect to their tissue of origin, the spatial and temporal expression patterns of the mRNAs corresponding to the 5' ESTs and consensus contigated 5' ESTs, as well as their expression levels, may be determined as described in Example 16 below.

[0168] Characterization of the spatial and temporal expression patterns and expression levels of these mRNAs is useful for constructing expression vectors capable of producing a desired level of gene product in a desired spatial or temporal manner, as will be discussed in more detail below.

[0169] Furthermore, 5' ESTs and consensus contigated 5' ESTs whose corresponding mRNAs are associated with disease states may also be identified. For example, a particular disease may result from the lack of expression, over expression, or under expression of a mRNA corresponding to a 5' EST or consensus contigated 5' EST. By comparing mRNA expression patterns and quantities in samples taken from healthy individuals with those from individuals suffering from a particular disease, 5' ESTs or consensus contigated 5' ESTs responsible for the disease may be identified.
[0170] It will be appreciated that the results of the above characterization procedures for 5' ESTs and consensus contigated 5' ESTs also apply to extended cDNAs (obtainable as described below) which contain sequences adjacent to the 5' ESTs and consensus contigated 5' ESTs. It will also be appreciated that if desired, characterization may be delayed until extended cDNAs have been obtained rather than characterizing the 5' ESTs or consensus contigated 5' ESTs themselves.

EXAMPLE 16 

Evaluation of Expression Levels and Patterns of mRNAs Corresponding to EST-Related Nucleic Acids

[0171] Expression levels and patterns of mRNAs corresponding to EST-related nucleic acids may be analyzed by solution hybridization with long probes as described in International Patent Application No. WO 97/05277. Briefly, an EST-related nucleic acid, fragment of an EST-related nucleic acid, positional segment of an EST-related nucleic acid, or fragment of a positional segment of an EST-related nucleic acid corresponding to the gene encoding the mRNA to be characterized is inserted at a cloning site immediately downstream of a bacteriophage (T3, T7 or SP6) RNA polymerase promoter to produce antisense RNA. Preferably, the EST-related nucleic acid, fragment of an EST-related nucleic acid, positional segment of an EST-related nucleic acid, or fragment of a positional segment of an EST-related nucleic acid is 100 or more nucleotides in length. The plasmid is linearized and transcribed in the presence of ribonucleotides comprising modified ribonucleotides (i.e. biotin-UTP and DIG-UTP). An excess of this doubly labeled RNA is hybridized in solution with mRNA isolated from cells or tissues of interest. The hybridizations are performed under standard stringent conditions (40-50°C for 16 hours in an 80% formamide, 0.4 M NaCl buffer, pH 7-8). The unhybridized probe is removed by digestion with ribonucleases specific for single-stranded RNA (i.e. RNases CL3, T1, Phy M, U2 or A). The presence of the biotin-UTP modification enables capture of the hybrid on a microtitration plate coated with streptavidin. The presence of the DIG modification enables the hybrid to be detected and quantified using an anti-DIG antibody coupled to alkaline phosphatase.
[0172] The EST-related nucleic acid, fragment of an EST-related nucleic acid, positional segment of an EST-related nucleic acid, or fragment of a positional segment of an EST-related nucleic acid may also be tagged with nucleotide sequences for the serial analysis of gene expression (SAGE) as disclosed in UK Patent Application No. 2 305 241 A. In this method, cDNAs are prepared from a cell, tissue, organism or other source of nucleic acid for which gene expression patterns must be determined. The resulting cDNAs are separated into two pools. The cDNAs in each pool are cleaved with a first restriction endonuclease, called an anchoring enzyme, having a recognition site which is likely to be present at least once in most cDNAs. The fragments which contain the 5' or 3' most region of the cleaved cDNA are isolated by binding to a capture medium, such as streptavidin coated beads. A first oligonucleotide linker having a first sequence for hybridization of an amplification primer and an internal restriction site for a so called tagging endonuclease is ligated to the digested cDNAs in the first pool. Digestion with the second endonuclease produces short tag fragments from the cDNAs.
[0173] A second oligonucleotide having a second sequence for
hybridization of an amplification primer and an internal restriction
site is ligated to the digested cDNAs in the second pool. The cDNA
fragments in the second pool are also digested with the tagging
endonuclease to generate short tag fragments derived from the cDNAs in
the second pool. The tags resulting from digestion of the first and
second pools with the anchoring enzyme and the tagging endonuclease are
ligated to one another to produce so called ditags. In some
embodiments, the ditags are concatamerized to produce ligation products
containing from 2 to 200 ditags. The tag sequences are then determined
and compared to the sequences of the EST-related nucleic acid, fragment
of an EST-related nucleic acid, positional segment of an EST-related
nucleic acid, or fragment of a positional segment of an EST-related
nucleic acid to determine which 5' ESTs, contigated consensus 5' ESTs,
or extended cDNAs are expressed in the cell, tissue, organism, or other
source of nucleic acids from which the tags were derived. In this way,
the expression pattern of the 5' ESTs, contigated consensus 5' ESTs, or
extended cDNAs in the cell, tissue, organism, or other source of
nucleic acids is obtained.
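
By way of illustration only, the comparison of tag sequences to EST-related sequences described above may be sketched as follows. The sketch assumes exact substring matching between a tag and a reference sequence; the function name, the tag sequences, and the reference sequences are hypothetical and are not taken from UK Patent Application No. 2 305 241 A.

```python
from collections import Counter


def count_expressed(tags, reference):
    """Tally how often each reference sequence is hit by an observed tag.

    tags      -- list of tag sequences read from the ditag concatemers
    reference -- dict mapping a sequence name to its nucleotide sequence
    Assumption: a tag "matches" a reference if it occurs in it exactly.
    """
    tag_counts = Counter(tags)
    expression = {}
    for name, sequence in reference.items():
        # Sum the counts of every distinct tag found inside this reference.
        expression[name] = sum(
            n for tag, n in tag_counts.items() if tag in sequence
        )
    return expression


# Hypothetical data: two reference sequences and three observed tags.
reference = {
    "est1": "AAACATGTTTGGGCCC",
    "est2": "CCCCATGAAATTTGGG",
}
tags = ["CATGTTTGGG", "CATGTTTGGG", "CATGAAATTT"]
profile = count_expressed(tags, reference)
```

The resulting counts give a relative measure of how often each reference sequence is represented among the tags.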

[0174] Quantitative analysis of gene expression may also be performed
using arrays. As used herein, the term array means a one dimensional,
two dimensional, or multidimensional arrangement of EST-related nucleic
acids, fragments of EST-related nucleic acids, positional segments of
EST-related nucleic acids, or fragments of positional segments of
EST-related nucleic acids. Preferably, the EST-related nucleic acids,
fragments of EST-related nucleic acids, positional segments of
EST-related nucleic acids, or fragments of positional segments of
EST-related nucleic acids are at least 15 nucleotides in length. More
preferably, the EST-related nucleic acids, fragments of EST-related
nucleic acids, positional segments of EST-related nucleic acids, or
fragments of positional segments of EST-related nucleic acids are at
least 100 nucleotides long. More preferably, the fragments are more
than 100 nucleotides in length. In some embodiments, the EST-related
nucleic acids, fragments of EST-related nucleic acids, positional
segments of EST-related nucleic acids, or fragments of positional
segments of EST-related nucleic acids may be more than 500 nucleotides
long.

[0175] For example, quantitative analysis of gene expression may be
performed with EST-related nucleic acids, fragments of EST-related
nucleic acids, positional segments of EST-related nucleic acids, or
fragments of positional segments of EST-related nucleic acids in a
complementary DNA microarray as described by Schena et al. (Science
270:467-470, 1995; Proc. Natl. Acad. Sci. U.S.A. 93:10614-10619, 1996).
EST-related nucleic acids, fragments of EST-related nucleic acids,
positional segments of EST-related nucleic acids, or fragments of
positional segments of EST-related nucleic acids are amplified by PCR
and arrayed from 96-well microtiter plates onto silylated microscope
slides using high-speed robotics. Printed arrays are incubated in a
humid chamber to allow rehydration of the array elements and rinsed,
once in 0.2% SDS for 1 min, twice in water for 1 min and once for 5 min
in sodium borohydride solution. The arrays are submerged in water for
2 min at 95°C, transferred into 0.2% SDS for 1 min, rinsed twice with
water, air dried and stored in the dark at 25°C.

[0176] Cell or tissue mRNA is isolated or commercially obtained and
probes are prepared by a single round of reverse transcription. Probes
are hybridized to 1 cm2 microarrays under a 14 x 14 mm glass coverslip
for 6-12 hours at 60°C. Arrays are washed in low stringency wash buffer
(1 x SSC/0.2% SDS), then for 10 min at room temperature in high
stringency wash buffer (0.1 x SSC/0.2% SDS). Arrays are scanned in
0.1 x SSC using a fluorescence laser scanning device fitted with a
custom filter set. Accurate differential expression measurements are
obtained by taking the average of the ratios of two independent
hybridizations.
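
The averaging step described above may be sketched as follows. This is a minimal sketch, assuming that each hybridization yields, for every array element, a pair of fluorescence signals (sample and reference); the function name and the numbers are hypothetical.

```python
def mean_expression_ratio(hyb1, hyb2):
    """Average the sample/reference ratio over two hybridizations.

    hyb1, hyb2 -- dicts mapping an array element name to a
                  (sample_signal, reference_signal) pair.
    Only elements measured in both hybridizations are reported.
    """
    averaged = {}
    for element in hyb1.keys() & hyb2.keys():
        ratio1 = hyb1[element][0] / hyb1[element][1]
        ratio2 = hyb2[element][0] / hyb2[element][1]
        averaged[element] = (ratio1 + ratio2) / 2
    return averaged


# Hypothetical signals for two array elements in two hybridizations.
first = {"a": (200.0, 100.0), "b": (50.0, 100.0)}
second = {"a": (300.0, 100.0), "b": (50.0, 100.0)}
ratios = mean_expression_ratio(first, second)
```
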
[0177] Quantitative analysis of the expression of genes may also be
performed with EST-related nucleic acids, fragments of EST-related
nucleic acids, positional segments of EST-related nucleic acids, or
fragments of positional segments of EST-related nucleic acids in
complementary DNA arrays as described by Pietu et al. (Genome Research
6:492-503, 1996). The EST-related nucleic acids, fragments of
EST-related nucleic acids, positional segments of EST-related nucleic
acids, or fragments of positional segments of EST-related nucleic acids
thereof are PCR amplified and spotted on membranes. Then, mRNAs
originating from various tissues or cells are labeled with radioactive
nucleotides. After hybridization and washing in controlled conditions,
the hybridized mRNAs are detected by phospho-imaging or
autoradiography. Duplicate experiments are performed and a quantitative
analysis of differentially expressed mRNAs is then performed.

[0178] Alternatively, expression analysis of the EST-related nucleic
acids, fragments of EST-related nucleic acids, positional segments of
EST-related nucleic acids, or fragments of positional segments of
EST-related nucleic acids can be done through high density nucleotide
arrays as described by Lockhart et al. (Nature Biotechnology
14:1675-1680, 1996) and Sosnowsky et al. (Proc. Natl. Acad. Sci.
94:1119-1123, 1997). Oligonucleotides of 15-50 nucleotides
corresponding to sequences of EST-related nucleic acids, fragments of
EST-related nucleic acids, positional segments of EST-related nucleic
acids, or fragments of positional segments of EST-related nucleic acids
are synthesized directly on the chip (Lockhart et al., supra) or
synthesized and then addressed to the chip (Sosnowsky et al., supra).
Preferably, the oligonucleotides are about 20 nucleotides in length.
[0179] cDNA probes labeled with an appropriate compound, such as
biotin, digoxigenin or fluorescent dye, are synthesized from the
appropriate mRNA population and then randomly fragmented to an average
size of 50 to 100 nucleotides. The said probes are then hybridized to
the chip. After washing as described in Lockhart et al., supra and
application of different electric fields (Sosnowsky et al., supra), the
dyes or labeling compounds are detected and quantified. Duplicate
hybridizations are performed. Comparative analysis of the intensity of
the signal originating from cDNA probes on the same target
oligonucleotide in different cDNA samples indicates a differential
expression of the mRNA corresponding to the 5' EST, consensus
contigated 5' EST or extended cDNA from which the oligonucleotide
sequence has been designed.

III. Use of 5' ESTs to Clone Extended cDNAs and to
Clone the Corresponding Genomic DNAs

[0180] Once 5' ESTs or consensus contigated 5' ESTs which include the
5' end of the corresponding mRNAs have been selected using the
procedures described above, they can be utilized to isolate extended
cDNAs which contain sequences adjacent to the 5' ESTs or contigated
consensus 5' ESTs. The extended cDNAs may include the entire coding
sequence of the protein encoded by the corresponding mRNA, including
the authentic translation start site. If the extended cDNA encodes a
secreted protein, it may contain the signal sequence, and the sequence
encoding the mature protein remaining after cleavage of the signal
peptide. Extended cDNAs which include the entire coding sequence of the
protein encoded by the corresponding mRNA are referred to herein as
full-length cDNAs. Alternatively, the extended cDNAs may not include
the entire coding sequence of the protein encoded by the corresponding
mRNA, although they do include sequences adjacent to the 5' ESTs or
contigated consensus 5' ESTs. In some embodiments in which the extended
cDNAs are derived from an mRNA encoding a secreted protein, the
extended cDNAs may include only the sequence encoding the mature
protein remaining after cleavage of the signal peptide, or only the
sequence encoding the signal peptide.

[0181] Example 17 below describes a general method for obtaining
extended cDNAs using 5' ESTs or consensus contigated 5' ESTs. Example
28 below describes the cloning and sequencing of several extended
cDNAs, including extended cDNAs which include the entire coding
sequence and authentic 5' end of the corresponding mRNA for several
secreted proteins.
[0182] The methods of Examples 17 and 18 can also be used to obtain
extended cDNAs which encode less than the entire coding sequence of
proteins encoded by the genes corresponding to the 5' ESTs or consensus
contigated 5' ESTs. In some embodiments, the extended cDNAs isolated
using these methods encode at least 5, 10, 15, 20, 25, 30, 35, 40, 50,
75, 100, or 150 consecutive amino acids of one of the proteins encoded
by the sequences of SEQ ID NOs: 24-4100 and 8178-36681. In some
embodiments, the extended cDNAs isolated using these methods encode at
least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive
amino acids of one of the proteins encoded by the sequences of SEQ ID
NOs: 24-4109.

EXAMPLE 17

General Method for Using 5' ESTs to Clone and
Sequence Extended cDNAs which Include the Entire
Coding Region and the Authentic 5' End of the
Corresponding mRNA

[0183] The following general method has been used to quickly and
efficiently isolate extended cDNAs including sequence adjacent to the
sequences of the 5' ESTs used to obtain them. This method may be
applied to obtain extended cDNAs for any 5' EST or consensus contigated
5' EST of the invention, including those 5' ESTs and consensus
contigated 5' ESTs encoding secreted proteins. This method is
summarized in Figure 3.

1. Obtaining Extended cDNAs

a) First strand synthesis

[0184] The method takes advantage of the known 5' sequence of the
mRNA. A reverse transcription reaction is conducted on purified mRNA
with a poly dT primer containing a nucleotide sequence at its 5' end
allowing the addition of a known sequence at the end of the cDNA which
corresponds to the 3' end of the mRNA. Such a primer and a commercially
available reverse transcriptase enzyme are added to a buffered mRNA
sample, yielding a reverse transcript anchored at the 3' polyA site of
the RNAs. Nucleotide monomers are then added to complete the first
strand synthesis.
[0185] After removal of the mRNA hybridized to the first cDNA strand by
alkaline hydrolysis, the products of the alkaline hydrolysis and the
residual poly dT primer can be eliminated with an exclusion column.



b) Second strand synthesis 

[0186] A pair of nested primers on each end is designed based on the
known 5' sequence from the 5' EST or contigated consensus 5' EST and
the known 3' end added by the poly dT primer used in the first strand
synthesis. Software used to design primers is either based on GC
content and melting temperatures of oligonucleotides, such as OSP
(Hillier and Green, PCR Meth. Appl. 1:124-128, 1991), or based on the
octamer frequency disparity method (Griffais et al., Nucleic Acids Res.
19:3887-3891, 1991), such as the program available at
...ics.weizmann.ac.il/softwa...html.

[0187] Preferably, the nested primers at the 5' end and the nested
primers at the 3' end are separated from one another by four to nine
bases. These primer sequences may be selected to have melting
temperatures and specificities suitable for use in PCR.
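
A primer screen based on GC content and melting temperature may be sketched as follows. This is a minimal sketch and not the OSP algorithm: it uses the Wallace rule (2°C per A or T, 4°C per G or C) as a rough melting temperature estimate for short oligonucleotides, and the acceptance ranges and primer sequences are hypothetical.

```python
def gc_fraction(primer):
    """Return the fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in deg C by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def acceptable(primer, gc_range=(0.4, 0.6), tm_range=(55, 65)):
    """Hypothetical screen: keep primers inside both ranges."""
    return (
        gc_range[0] <= gc_fraction(primer) <= gc_range[1]
        and tm_range[0] <= wallace_tm(primer) <= tm_range[1]
    )


# Hypothetical 20-mers: one balanced, one made only of A and T.
candidates = ["ATGC" * 5, "AT" * 10]
kept = [p for p in candidates if acceptable(p)]
```
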
[0188] A first PCR run is performed using the outer primer from each of
the nested pairs. A second PCR run using the same enzyme and the inner
primer from each of the nested pairs is then performed on a small
sample of the first PCR product. Thereafter, the primers and remaining
nucleotide monomers are removed.

2. Sequencing of Full Length Extended cDNAs or
Fragments Thereof

[0189] Due to the lack of position constraints on the design of 5'
nested primers compatible for PCR use using the OSP software, amplicons
of two types are obtained. Preferably, the second 5' primer is located
upstream of the translation initiation codon, thus yielding a nested
PCR product containing the entire coding sequence. Such a full length
extended cDNA may be used in a direct cloning procedure. However, in
some cases, the second 5' primer is located downstream of the
translation initiation codon, thereby yielding a PCR product containing
only part of the ORF. Such incomplete PCR products are submitted to a
modified procedure described in section b below.

a) Nested PCR products containing complete ORFs

[0190] When the resulting nested PCR product contains the complete
coding sequence, as predicted from the 5' EST or consensus contigated
5' EST sequence, it is cloned in an appropriate vector.

b) Nested PCR products containing incomplete ORFs

[0191] When the amplicon does not contain the complete coding sequence,
intermediate steps are necessary to obtain both the complete coding
sequence and a PCR product containing the full coding sequence. The
complete coding sequence can be assembled from several partial
sequences determined directly from different PCR products.

[0192] Once the full coding sequence has been completely determined,
new primers compatible for PCR use are then designed to obtain
amplicons containing the whole coding region. However, in such cases,
3' primers compatible for PCR use are located inside the 3' UTR of the
corresponding mRNA, thus yielding amplicons which lack part of this
region, i.e. the polyA tract and sometimes the polyadenylation signal,
as illustrated in Figure 3. Such full length extended cDNAs are then
cloned into an appropriate vector.

c) Sequencing extended cDNAs

[0193] Sequencing of extended cDNAs can be performed using a Dye
Terminator approach with the AmpliTaq DNA polymerase FS kit available
from Perkin Elmer.

[0194] In order to sequence PCR fragments, primer walking is performed
using software such as OSP to choose primers and automated computer
software such as ASMG (Sutton et al., Genome Science Technol. 1:9-19,
1995) to construct contigs of walking sequences including the initial
5' tag using minimum overlaps of 32 nucleotides. Preferably, primer
walking is performed until the sequences of full length cDNAs are
obtained.

3. Cloning of Full Length Extended cDNAs

[0195] The PCR product containing the full coding sequence is then
cloned in an appropriate vector. For example, the extended cDNAs can be
cloned into any expression vector known in the art.
[0196] Since the PCR products obtained as described above are blunt
ended molecules that can be cloned in either direction, the orientation
of several clones for each PCR product is determined. Then, 4 to 10
clones are ordered in microtiter plates and subjected to a PCR reaction
using a first primer located in the vector close to the cloning site
and a second primer located in the portion of the extended cDNA
corresponding to the 3' end of the mRNA. This second primer may be the
antisense primer used in anchored PCR in the case of direct cloning
(case a) or the antisense primer located inside the 3' UTR in the case
of indirect cloning (case b). Clones in which the start codon of the
extended cDNA is operably linked to the promoter in the vector so as to
permit expression of the protein encoded by the extended cDNA are
conserved and sequenced. In addition to the ends of cDNA inserts,
approximately 50 bp of vector DNA on each side of the cDNA insert are
also sequenced.

[0197] Cloned PCR products are then entirely sequenced in order to
obtain at least two sequences per clone. Preferably, the sequences are
obtained from both sense and antisense strands according to the
aforementioned procedure with the following modifications. First, both
5' and 3' ends of cloned PCR products are sequenced in order to confirm
the identity of the clone. Second, primer walking is performed if the
full coding region has not been obtained yet. Contigation is then
performed using primer walking sequences for cloned products as well as
walking sequences that have already contigated for uncloned PCR
products. The sequence is considered complete when the resulting
contigs include the whole coding region as well as overlapping
sequences with vector DNA on both ends. All the contigated sequences
for each cloned amplicon are then used to obtain a consensus sequence.

4. Selection of cloned full length sequences obtained
from the 5' ESTs of the present invention

[0198] A negative selection may be performed in order to eliminate
unwanted cloned sequences resulting from either contaminants or PCR
artifacts as follows. Sequences matching contaminant sequences such as
vector DNA, tRNA, mtRNA, rRNA sequences are discarded as well as those
encoding ORF sequences exhibiting extensive homology to repeats.
Sequences obtained by direct cloning using nested primers on 5' and 3'
tags (section 1, case a) but lacking polyA tail may be discarded. Only
ORFs containing a signal peptide and ending either before the polyA
tail (case a) or before the end of the cloned 3' UTR (case b) may be
selected. Then, ORFs containing unlikely mature proteins such as mature
proteins whose size is less than 20 amino acids or less than 25% of the
immature protein size may be eliminated.

[0199] Then, for each remaining full length extended cDNA containing
several ORFs, a preselection of ORFs may be performed using the
following criteria. The longest ORF with a signal peptide is preferred.
If the ORF sizes are similar, the chosen ORF is the one whose signal
peptide has the highest score according to the Von Heijne method.
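
The mature protein filter and the ORF preselection criteria above may be sketched as follows. This is a minimal sketch under stated assumptions: the text does not define when ORF sizes are "similar", so a hypothetical tolerance of 10% of the longest ORF is used, and the record fields, names, and scores are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Orf:
    name: str
    length_aa: int        # size of the immature protein
    signal_len_aa: int    # 0 when no signal peptide was found
    signal_score: float   # hypothetical signal peptide score


def plausible_mature(orf):
    """Reject mature proteins under 20 residues or under 25% of the ORF."""
    mature = orf.length_aa - orf.signal_len_aa
    return mature >= 20 and mature >= 0.25 * orf.length_aa


def preselect(orfs, similar_tol=0.1):
    """Prefer the longest ORF with a signal peptide; break near-ties by score.

    similar_tol is an assumption: ORFs within this fraction of the
    longest candidate are treated as having similar sizes.
    """
    candidates = [
        o for o in orfs if o.signal_len_aa > 0 and plausible_mature(o)
    ]
    if not candidates:
        return None
    longest = max(o.length_aa for o in candidates)
    similar = [
        o for o in candidates if o.length_aa >= (1 - similar_tol) * longest
    ]
    return max(similar, key=lambda o: o.signal_score)


# Hypothetical ORFs found in one extended cDNA.
orfs = [
    Orf("a", 300, 20, 5.0),
    Orf("b", 290, 22, 7.5),
    Orf("c", 100, 25, 9.0),
    Orf("d", 30, 25, 9.9),   # mature protein of 5 residues: rejected
]
chosen = preselect(orfs)
```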

[0200] Sequences of full length extended cDNA clones may then be
compared pairwise with BLAST after masking of the repeat sequences.
Sequences containing at least 90% homology over 30 nucleotides may be
clustered in the same class. Each cluster may then be subjected to a
cluster analysis that detects sequences resulting from internal priming
or from alternative splicing, identical sequences or sequences with
several frameshifts. This automatic analysis serves as a basis for
manual selection of the sequences.
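
The clustering rule above may be sketched as follows. This is a minimal sketch, not BLAST: it assumes that "90% homology over 30 nucleotides" can be approximated by an ungapped 30-base window with at least 90% identical positions, and it groups sequences by single linkage. All names and sequences are hypothetical.

```python
import math


def share_window(a, b, window=30, min_identity=0.9):
    """True if some ungapped alignment has a window meeting the threshold."""
    needed = math.ceil(min_identity * window - 1e-9)
    # offset is the position in a that aligns with position 0 of b.
    for offset in range(-(len(b) - window), len(a) - window + 1):
        start = max(0, offset)
        end = min(len(a), len(b) + offset)
        if end - start < window:
            continue
        matches = [a[i] == b[i - offset] for i in range(start, end)]
        running = sum(matches[:window])
        if running >= needed:
            return True
        # Slide the window one base at a time along the overlap.
        for i in range(window, len(matches)):
            running += matches[i] - matches[i - window]
            if running >= needed:
                return True
    return False


def cluster(seqs):
    """Single-linkage clustering of named sequences with union-find."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if share_window(seqs[x], seqs[y]):
                parent[find(y)] = find(x)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical clones: "b" overlaps "a", "c" is unrelated.
clones = {"a": "ACGT" * 10, "b": "GG" + "ACGT" * 8, "c": "T" * 40}
classes = cluster(clones)
```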

[0201] Manual selection can be carried out using automatically
generated reports for each sequenced full length extended cDNA clone.
During this manual procedure, a selection is operated between clones
belonging to the same class as follows.

[0202] Selection of full length extended cDNA clones encoding sequences
of interest is performed using the following criteria. Structural
parameters (initial tag, polyadenylation site and signal) may be
checked. Then, homologies with known nucleic acids and proteins may be
examined in order to determine whether the clone sequence matches a
known nucleic acid/protein sequence and, in the latter case, its
covering rate and the date at which the sequence became public.
Sequences resulting from chimera or double inserts or located on
chromosome breaking points as assessed by homology to other sequences
may be discarded during this procedure as well.
[0203] Extended cDNAs prepared as described above may be subsequently
engineered to obtain nucleic acids which include desired portions of
the extended cDNA using conventional techniques such as subcloning,
PCR, or in vitro oligonucleotide synthesis. For example, if the
extended cDNA is derived from a gene encoding a secreted polypeptide,
it may include the full coding sequences (i.e. the sequences encoding
the signal peptide and the mature protein remaining after the signal
peptide is cleaved off), the sequences encoding the mature polypeptide
(i.e. the polypeptide generated after the signal peptide is cleaved
off), or the coding sequences for the signal peptides.
[0204] Similarly, nucleic acids containing any other desired portion of
the coding sequences for the encoded protein may be obtained. For
example, the nucleic acid may contain at least 10, 12, 15, 18, 20, 23,
25, 28, 30, 35, 40, 50, 75, 100, 200, 300, 500, or 1000 consecutive
bases of an extended cDNA.

[0205] Once an extended cDNA has been obtained, it can be sequenced to
determine the amino acid sequence it encodes. Once the encoded amino
acid sequence has been determined, one can create and identify any of
the many conceivable cDNAs that will encode that protein by simply
using the degeneracy of the genetic code. For example, allelic variants
or other homologous nucleic acids can be identified as described below.
Alternatively, nucleic acids encoding the desired amino acid sequence
can be synthesized in vitro.
[0206] In a preferred embodiment, the coding sequence may be selected
using the known codon or codon pair preferences for the host organism
in which the cDNA is to be expressed.
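
Selecting a coding sequence from codon preferences may be sketched as follows. This is a minimal sketch: the table below lists one valid codon per amino acid for a handful of residues only, and the choice of "preferred" codon is hypothetical rather than measured usage for any particular host organism.

```python
# Hypothetical preference table: one codon per residue, "*" for stop.
PREFERRED_CODON = {
    "M": "ATG",
    "K": "AAG",
    "L": "CTG",
    "W": "TGG",
    "*": "TGA",
}


def back_translate(protein, table=PREFERRED_CODON):
    """Build one of the many coding sequences that encode the protein."""
    try:
        return "".join(table[residue] for residue in protein.upper())
    except KeyError as exc:
        raise ValueError(
            f"no codon listed for residue {exc.args[0]!r}"
        ) from None


coding = back_translate("MKLW*")
```

A complete table for a given host would list all twenty amino acids; codon pair preferences would need a scoring step over adjacent codons, which this sketch does not attempt.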

[0207] In addition to PCR based methods for obtaining cDNAs which
include the authentic 5' end of the corresponding mRNA as well as the
full protein coding sequence of the corresponding mRNA, traditional
hybridization based methods may also be employed. These methods may
also be used to obtain the genomic DNAs which encode the mRNAs from
which the 5' ESTs or contigated consensus 5' ESTs were derived, mRNAs
corresponding to the extended cDNAs, or nucleic acids which are
homologous to extended cDNAs, 5' ESTs, or contigated consensus 5' ESTs.
Example 18 below provides examples of such methods.
[0208] Each identified ORF may be scanned for the presence of a signal
peptide in the first 50 amino acids or, where appropriate, within
shorter regions down to 20 amino acids or less in the ORF, using the
matrix method of von Heijne (Nuc. Acids Res. 14:4683-4690 (1986)) and
the modification described in Example 12.
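
Scanning the start of an ORF with a position weight matrix may be sketched as follows. This is a minimal sketch of the general technique only: the three-position matrix, its weights, and the default penalty are hypothetical toy values, not the published von Heijne matrix, and the modification of Example 12 is not represented.

```python
def best_signal_window(protein, matrix, scan_limit=50, default=-1.0):
    """Return (start, score) of the best-scoring window, or None.

    matrix     -- list of dicts, one per window position, mapping a
                  residue to its weight (hypothetical values here)
    scan_limit -- number of N-terminal residues examined
    default    -- weight used for residues absent from a position
    """
    width = len(matrix)
    region = protein[:scan_limit]
    best = None
    for start in range(len(region) - width + 1):
        score = sum(
            matrix[k].get(region[start + k], default) for k in range(width)
        )
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Toy matrix favoring an A-L-A motif; illustrative values only.
toy_matrix = [{"A": 1.0}, {"L": 0.5}, {"A": 1.0}]
hit = best_signal_window("MKLLALAGG", toy_matrix)
```

Restricting the scan to a shorter region only requires a smaller scan_limit.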

d) Homology to either nucleotide or protein sequences

[0209] Sequences of full-length extended cDNAs are then compared to
known nucleotide sequences. Polypeptides encoded by full-length
extended cDNAs are then compared to known amino acid sequences.

[0210] Sequences of full length extended cDNAs are compared to the
known vertebrate and EST nucleic acid sequences of the Genbank and EMBL
databases and of Genseq (Derwent's database of patented nucleotide
sequences). Full-length cDNA sequences are also compared to the
sequences of a private database (Genset internal sequences) in order to
find sequences that have already been identified by applicants.
Sequences of full-length extended cDNAs with more than 90% homology
over 30 nucleotides using either BLASTN or BLAST2N are identified as
sequences that have already been described. Matching vertebrate
sequences are subsequently examined using FASTA; full-length extended
cDNAs with more than 70% homology over 30 nucleotides are identified as
sequences that have already been described.

[0211] ORFs encoded by full-length extended cDNAs as defined in section
c) are subsequently compared to known amino acid sequences found in
public databases such as Swissprot, PIR and Genpept (Derwent's database
of patented protein sequences). These analyses were performed using
BLASTP with the parameter W=8 and allowing a maximum of 10 matches.
Sequences of full-length extended cDNAs showing extensive homology to
known protein sequences are recognized as already identified proteins.

[0212] In addition, the three-frame conceptual translation products of
the top strand of full-length extended cDNAs are compared to publicly
known amino acid sequences of Swissprot using BLASTX with the parameter
E=0.001. Sequences of full-length extended cDNAs with more than 70%
homology over 30 amino acid stretches are detected as already
identified proteins.

5. Selection of cloned full-length sequences obtained
from the 5' ESTs of the present invention

[0213] Cloned full-length extended cDNA sequences that have already
been characterized by the aforementioned computer analysis are then
submitted to an automatic procedure in order to preselect full-length
extended cDNAs containing sequences of interest.

a) Automatic sequence preselection

[0214] All complete cloned full-length extended cDNAs clipped for
vector on both ends are considered. First, a negative selection is
operated in order to eliminate unwanted cloned sequences resulting from
either contaminants or PCR artifacts as follows. Sequences matching
contaminant sequences such as vector DNA, tRNA, mtRNA, rRNA sequences
are discarded as well as those encoding ORF sequences exhibiting
extensive homology to repeats as defined in section 4 a). Sequences
obtained by direct cloning using nested primers on 5' and 3' tags
(section 1, case a) but lacking polyA tail are discarded. Only ORFs
containing a signal peptide and ending either before the polyA tail
(case a) or before the end of the cloned 3' UTR (case b) are kept.
Then, ORFs containing unlikely mature proteins such as mature proteins
whose size is less than 20 amino acids or less than 25% of the immature
protein size are eliminated.
[0215] Then, for each remaining full-length extended cDNA containing
several ORFs, a preselection of ORFs is performed using the following
criteria. The longest ORF with a signal peptide is preferred. If the
ORF sizes are similar, the chosen ORF is the one whose signal peptide
has the highest score according to the Von Heijne method.
[0216] Sequences of full-length extended cDNA clones are then compared
pairwise with BLAST after masking of the repeat sequences. Sequences
containing at least 90% homology over 30 nucleotides are clustered in
the same class. Each cluster is then subjected to a cluster analysis
that detects sequences resulting from internal priming or from
alternative splicing, identical sequences or sequences with several
frameshifts. This automatic analysis serves as a basis for manual
selection of the sequences.

b) Manual sequence selection

[0217] Manual selection can be carried out using automatically
generated reports for each sequenced full-length extended cDNA clone.
During this manual procedure, a selection is operated between clones
belonging to the same class as follows. ORF sequences encoded by clones
belonging to the same class are aligned and compared. If the homology
between nucleotide sequences of clones belonging to the same class is
more than 90% over 30 nucleotide stretches or if the homology between
amino acid sequences of clones belonging to the same class is more than
80% over 20 amino acid stretches, then the clones are considered as
being identical. The chosen ORF is either the one exhibiting matches
with known amino acid sequences or the best one according to the
criteria mentioned in the automatic sequence preselection section. If
the nucleotide and amino acid homologies are less than 90% and 80%
respectively, the clones are said to encode distinct proteins which can
be both selected if they contain sequences of interest.
[0218] Selection of full-length extended cDNA clones encoding sequences
of interest is performed using the following criteria. Structural
parameters (initial tag, polyadenylation site and signal) are first
checked. Then, homologies with known nucleic acids and proteins are
examined in order to determine whether the clone sequence matches a
known nucleotide/protein sequence and, in the latter case, its covering
rate and the date at which the sequence became public. If there is no
extensive match with sequences other than ESTs or genomic DNA, or if
the clone sequence brings substantial new information, such as encoding
a protein resulting from alternative splicing of an mRNA coding for an
already known protein, the sequence is kept. Examples of such cloned
full-length extended cDNAs containing sequences of interest are
described in Example 18. Sequences resulting from chimera or double
inserts or located on chromosome breaking points as assessed by
homology to other sequences are discarded during this procedure.

[0219] Extended cDNAs prepared as described above may be subsequently
engineered to obtain nucleic acids which include desired portions of
the extended cDNA using conventional techniques such as subcloning,
PCR, or in vitro oligonucleotide synthesis. For example, nucleic acids
which include only the full coding sequences (i.e. the sequences
encoding the signal peptide and the mature protein remaining after the
signal peptide is cleaved off) may be obtained using techniques known
to those skilled in the art. Alternatively, conventional techniques may
be applied to obtain nucleic acids which contain only the coding
sequences for the mature protein remaining after the signal peptide is
cleaved off or nucleic acids which contain only the coding sequences
for the signal peptides.
[0220] Similarly, nucleic acids containing any other desired portion of
the coding sequences for the encoded protein may be obtained. For
example, the nucleic acid may contain at least 10, 15, 18, 20, 25, 28,
30, 35, 40, 50, 75, 100, 150, 200, 300, 400 or 500 consecutive bases of
an extended cDNA.

[0221] Once an extended cDNA has been obtained, it can be sequenced to
determine the amino acid sequence it encodes. Once the encoded amino
acid sequence has been determined, one can create and identify any of
the many conceivable cDNAs that will encode that protein by simply
using the degeneracy of the genetic code. For example, allelic variants
or other homologous nucleic acids can be identified as described below.
Alternatively, nucleic acids encoding the desired amino acid sequence
can be synthesized in vitro.
[0222] In a preferred embodiment, the coding sequence may be selected
using the known codon or codon pair preferences for the host organism
in which the cDNA is to be expressed.

[0223] In addition to PCR based methods for obtaining cDNAs which
include the authentic 5' end of the corresponding mRNA as well as the
complete protein coding sequence of the corresponding mRNA, traditional
hybridization based methods may also be employed. These methods may
also be used to obtain the genomic DNAs which encode the mRNAs from
which the 5' ESTs or consensus contigated 5' ESTs were derived, mRNAs
corresponding to the extended cDNAs, or nucleic acids which are
homologous to extended cDNAs, 5' ESTs, or consensus contigated 5' ESTs.
Example 18 below provides examples of such methods.

EXAMPLE 18

Methods for Obtaining Extended cDNAs which Include
the Entire Coding Region and the Authentic 5' End of the
Corresponding mRNA or Nucleic Acids Homologous to
Extended cDNAs, 5' ESTs or Consensus Contigated 5'
ESTs

[0224] A full-length cDNA library can be made using the strategies
described in Examples 1-4 above by replacing the random nonamer used in
Example 2 with an oligo-dT primer. Alternatively, a cDNA library or
genomic DNA library may be obtained from a commercial source or made
using techniques familiar to those skilled in the art.

[0225] Such cDNA or genomic DNA libraries may be used to isolate
extended cDNAs obtained from 5' ESTs or consensus contigated 5' ESTs or
nucleic acids homologous to extended cDNAs, 5' ESTs, or consensus
contigated 5' ESTs, as follows. The cDNA library or genomic DNA library
is hybridized to a detectable probe. The detectable probe may comprise
at least 10, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200,
300, 400 or 500 consecutive nucleotides of the 5' EST, consensus
contigated 5' EST, or extended cDNA.
[0226] Techniques for identifying cDNA clones in a cDNA library which
hybridize to a given probe sequence are disclosed in Sambrook et al.,
Molecular Cloning: A Laboratory Manual 2d Ed., Cold Spring Harbor
Laboratory Press, 1989. The same techniques may be used to isolate
genomic DNAs.
[0227] Briefly, cDNA or genomic DNA clones which hybridize to the
detectable probe are identified and isolated for further manipulation
as follows. The detectable probe described in the preceding paragraph
is labeled with a detectable label such as a radioisotope or a
fluorescent molecule. Techniques for labeling the probe are well known
and include phosphorylation with polynucleotide kinase, nick
translation, in vitro transcription, and non radioactive techniques.
The cDNAs or genomic DNAs in the library are transferred to a
nitrocellulose or nylon filter and denatured. After blocking of non
specific sites, the filter is incubated with the labeled probe for an
amount of time sufficient to allow binding of the probe to cDNAs or
genomic DNAs containing a sequence capable of hybridizing thereto.

[0228] By varying the stringency of the hybridization conditions used to identify cDNAs or genomic DNAs which hybridize to the detectable probe, cDNAs or genomic DNAs having different levels of homology to the probe can be identified and isolated as described below.



1. Identification of cDNA or Genomic DNA Sequences Having a High Degree of Homology to the Labeled Probe

[0229] To identify cDNAs or genomic DNAs having a high degree of homology to the probe sequence, the melting temperature of the probe may be calculated using the following formulas:

[0230] For probes between 14 and 70 nucleotides in length the melting temperature (Tm) is calculated using the formula: Tm = 81.5 + 16.6(log [Na+]) + 0.41(fraction G+C) - (600/N) where N is the length of the probe.
[0231] If the hybridization is carried out in a solution containing formamide, the melting temperature may be calculated using the equation Tm = 81.5 + 16.6(log [Na+]) + 0.41(fraction G+C) - (0.63% formamide) - (600/N) where N is the length of the probe.
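The two melting temperature formulas above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the disclosed method. The function name and arguments are hypothetical. It assumes that the G+C term is expressed as a percentage (0-100), as in the commonly used form of this formula, that the logarithm is base 10, and that the formamide correction is 0.63 degrees C per percent formamide.

```python
import math


def melting_temperature(probe, na_molar, formamide_pct=0.0):
    """Estimate the Tm (degrees C) of a probe of roughly 14-70 nucleotides.

    Assumptions (not stated explicitly in the text): G+C is taken as a
    percentage, the logarithm is base 10, and each percent of formamide
    lowers the Tm by 0.63 degrees C.
    """
    probe = probe.upper()
    n = len(probe)
    gc_pct = 100.0 * sum(base in "GC" for base in probe) / n
    tm = 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_pct - 600.0 / n
    return tm - 0.63 * formamide_pct


# Hypothetical 20-mer with 50% G+C in 1 M sodium, without and with 50% formamide.
probe = "ATGC" * 5
print(round(melting_temperature(probe, 1.0), 1))
print(round(melting_temperature(probe, 1.0, formamide_pct=50.0), 1))
```

Under these assumptions the hybridization temperature would then be chosen some degrees below the computed value, as described in the paragraphs that follow.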

[0232] Prehybridization may be carried out in 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 µg denatured fragmented salmon sperm DNA or 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 µg denatured fragmented salmon sperm DNA, 50% formamide. The formulas for SSC and Denhardt's solutions are listed in Sambrook et al., supra.
[0233] Hybridization is conducted by adding the detectable probe to the prehybridization solutions listed above. Where the probe comprises double stranded DNA, it is denatured before addition to the hybridization solution. The filter is contacted with the hybridization solution for a sufficient period of time to allow the probe to hybridize to extended cDNAs or genomic DNAs containing sequences complementary thereto or homologous thereto. For probes over 200 nucleotides in length, the hybridization may be carried out at 15-25°C below the Tm. For shorter probes, such as oligonucleotide probes, the hybridization may be conducted at 15-25°C below the Tm. Preferably, for hybridizations in 6X SSC, the hybridization is conducted at approximately 68°C. Preferably, for hybridizations in 50% formamide containing solutions, the hybridization is conducted at approximately 42°C.
[0234] All of the foregoing hybridizations would be considered to be under conditions of high stringency.

[0235] Following hybridization, the filter is washed in 2X SSC, 0.1% SDS at room temperature for 15 minutes. The filter is then washed with 0.1X SSC, 0.5% SDS at room temperature for 30 minutes to 1 hour. Thereafter, the solution is washed at the hybridization temperature in 0.1X SSC, 0.5% SDS. A final wash is conducted in 0.1X SSC at room temperature.

[0236] cDNAs or genomic DNAs which have hybridized to the probe are identified by autoradiography or other conventional techniques.

2. Obtaining cDNA or Genomic DNA Sequences Having Lower Degrees of Homology to the Labeled Probe

[0237] The above procedure may be modified to identify cDNAs or genomic DNAs having decreasing levels of homology to the probe sequence. For example, to obtain cDNAs or genomic DNAs of decreasing homology to the detectable probe, less stringent conditions may be used. For example, the hybridization temperature may be decreased in increments of 5°C from 68°C to 42°C in a hybridization buffer having a sodium concentration of approximately 1M. Following hybridization, the filter may be washed with 2X SSC, 0.5% SDS at the temperature of hybridization. These conditions are considered to be "moderate" conditions above 50°C and "low" conditions below 50°C.

[0238] Alternatively, the hybridization may be carried out in buffers, such as 6X SSC, containing formamide at a temperature of 42°C. In this case, the concentration of formamide in the hybridization buffer may be reduced in 5% increments from 50% to 0% to identify clones having decreasing levels of homology to the probe. Following hybridization, the filter may be washed with 6X SSC, 0.5% SDS at 50°C. These conditions are considered to be "moderate" conditions above 25% formamide and "low" conditions below 25% formamide.
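The two step-down schemes described above (temperature in 5°C steps at roughly 1M sodium, or formamide in 5% steps at 42°C) can be enumerated with a short Python sketch. The function names are hypothetical. The text does not say how a condition exactly at the cutoff (50°C or 25% formamide) is classified, so the sketch labels that case "unspecified" as an assumption.

```python
def step_down(start, stop, step=5):
    """List values from start down to (at most) stop in fixed decrements."""
    return list(range(start, stop - 1, -step))


def stringency_label(value, cutoff):
    """Label a condition as described in the text: above the cutoff is
    "moderate", below it is "low". The cutoff itself is not classified
    in the text, so it is reported as "unspecified" here (assumption)."""
    if value > cutoff:
        return "moderate"
    if value < cutoff:
        return "low"
    return "unspecified"


# Temperature series in a buffer of approximately 1 M sodium (cutoff 50 degrees C).
for temp_c in step_down(68, 42):
    print(f"{temp_c} C: {stringency_label(temp_c, 50)}")

# Formamide series at 42 degrees C (cutoff 25% formamide).
for formamide_pct in step_down(50, 0):
    print(f"{formamide_pct}% formamide: {stringency_label(formamide_pct, 25)}")
```

Note that 5°C decrements starting at 68°C stop at 43°C rather than exactly 42°C; the sketch simply never goes below the stated lower bound.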
[0239] cDNAs or genomic DNAs which have hybridized to the probe are identified by autoradiography.

3. Determination of the Degree of Homology between the Obtained cDNAs or Genomic DNAs and 5'ESTs, Consensus Contigated 5'ESTs, or Extended cDNAs or Between the Polypeptides Encoded by the Obtained cDNAs or Genomic DNAs and the Polypeptides Encoded by the 5'ESTs, Consensus Contigated 5'ESTs, or Extended cDNAs

[0240] To determine the level of homology between the hybridized cDNA or genomic DNA and the 5'EST, consensus contigated 5'EST or extended cDNA from which the probe was derived, the nucleotide sequences of the hybridized nucleic acid and the 5'EST, consensus contigated 5'EST or extended cDNA from which the probe was derived are compared. The sequences of the 5'EST, consensus contigated 5'EST or extended cDNA from which the probe was derived and the sequences of the cDNA or genomic DNA which hybridized to the detectable probe may be stored on a computer readable medium as described below and compared to one another using any of a variety of algorithms familiar to those skilled in the art, including those described below.
[0241] To determine the level of homology between the polypeptide encoded by the hybridizing cDNA or genomic DNA and the polypeptide encoded by the 5'EST, consensus contigated 5'EST or extended cDNA from which the probe was derived, the polypeptide sequence encoded by the hybridized nucleic acid and the polypeptide sequence encoded by the 5'EST, consensus contigated 5'EST or extended cDNA from which the probe was derived are compared. The sequences of the polypeptide encoded by the 5'EST, consensus contigated 5'EST or extended cDNA from which the probe was derived and the polypeptide sequence encoded by the cDNA or genomic DNA which hybridized to the detectable probe may be stored on a computer readable medium as described below and compared to one another using any of a variety of algorithms familiar to those skilled in the art, including those described below.
[0242] Protein and/or nucleic acid sequence homologies may be evaluated using any of the variety of sequence comparison algorithms and programs known in the art. Such algorithms and programs include, but are by no means limited to, TBLASTN, BLASTP, FASTA, TFASTA, and CLUSTALW (Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85(8):2444-2448; Altschul et al., 1990, J. Mol. Biol. 215(3):403-410; Thompson et al., 1994, Nucleic Acids Res. 22(2):4673-4680; Higgins et al., 1996, Methods Enzymol. 266:383-402; Altschul et al., 1990, J. Mol. Biol. 215(3):403-410; Altschul et al., 1993, Nature Genetics 3:266-272).
[0243] In a particularly preferred embodiment, protein and nucleic acid sequence homologies are evaluated using the Basic Local Alignment Search Tool ("BLAST") which is well known in the art (see, e.g., Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2267-2268; Altschul et al., 1990, J. Mol. Biol. 215:403-410; Altschul et al., 1993, Nature Genetics 3:266-272; Altschul et al., 1997, Nuc. Acids Res. 25:3389-3402). In particular, five specific BLAST programs are used to perform the following task:

(1) BLASTP and BLAST3 compare an amino acid query sequence against a protein sequence database;
(2) BLASTN compares a nucleotide query sequence against a nucleotide sequence database;
(3) BLASTX compares the six-frame conceptual translation products of a query nucleotide sequence (both strands) against a protein sequence database;
(4) TBLASTN compares a query protein sequence against a nucleotide sequence database translated in all six reading frames (both strands); and
(5) TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
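The "six-frame conceptual translation" used by BLASTX, TBLASTN and TBLASTX in the list above can be illustrated with a minimal Python sketch. It is not the BLAST implementation. It assumes the standard genetic code, upper-case or lower-case A/C/G/T input, and reports any codon containing another symbol as "X"; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate both strands in all three reading frames."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Hypothetical six-nucleotide query.
for frame, peptide in six_frame_translations("ATGGCC").items():
    print(frame, peptide)
```

Each of the six translated frames would then be compared against the protein database (BLASTX) or against the translated frames of the database sequences (TBLASTX).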

[0244] The BLAST programs identify homologous sequences by identifying similar segments, which are referred to herein as "high-scoring segment pairs," between a query amino or nucleic acid sequence and a test sequence, which is preferably obtained from a protein or nucleic acid sequence database. High-scoring segment pairs are preferably identified (i.e., aligned) by means of a scoring matrix, many of which are known in the art. Preferably, the scoring matrix used is the BLOSUM62 matrix (Gonnet et al., 1992, Science 256:1443-1445; Henikoff and Henikoff, 1993, Proteins 17:49-61). Less preferably, the PAM or PAM250 matrices may also be used (see, e.g., Schwartz and Dayhoff, eds., 1978, Matrices for Detecting Distance Relationships: Atlas of Protein Sequence and Structure, Washington: National Biomedical Research Foundation).
[0245] The BLAST programs evaluate the statistical significance of all high-scoring segment pairs identified, and preferably select those segments which satisfy a user-specified threshold of significance, such as a user-specified percent homology. Preferably, the statistical significance of a high-scoring segment pair is evaluated using the statistical significance formula of Karlin (see, e.g., Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2267-2268).

[0246] The parameters used with the above algorithms may be adapted depending on the sequence length and degree of homology studied. In some embodiments, the parameters may be the default parameters used by the algorithms in the absence of instructions from the user.

[0247] In some embodiments, the level of homology between the hybridized nucleic acid and the extended cDNA, 5'EST, or 5' consensus contigated EST from which the probe was derived may be determined using the FASTDB algorithm described in Brutlag et al. Comp. App. Biosci. 6:237-245, 1990. In such analyses the parameters may be selected as follows: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the sequence which hybridizes to the probe, whichever is shorter. Because the FASTDB program does not consider 5' or 3' truncations when calculating homology levels, if the sequence which hybridizes to the probe is truncated relative to the sequence of the extended cDNA, 5'EST, or consensus contigated 5'EST from which the probe was derived the homology level is manually adjusted by calculating the number of nucleotides of the extended cDNA, 5'EST, or consensus contigated 5' EST which are not matched or aligned with the hybridizing sequence, determining the percentage of total nucleotides of the hybridizing sequence which the non-matched or non-aligned nucleotides represent, and subtracting this percentage from the homology level. For example, if the hybridizing sequence is 700 nucleotides in length and the extended cDNA, 5'EST, or consensus contigated 5' EST sequence is 1000 nucleotides in length wherein the first 300 bases at the 5' end of the extended cDNA, 5'EST, or consensus contigated 5' EST are absent from the hybridizing sequence, and wherein the overlapping 700 nucleotides are identical, the homology level would be adjusted as follows. The non-matched, non-aligned 300 bases represent 30% of the length of the extended cDNA, 5'EST, or consensus contigated 5' EST. If the overlapping 700 nucleotides are 100% identical, the adjusted homology level would be 100-30=70% homology. It should be noted that the preceding adjustments are only made when the non-matched or non-aligned nucleotides are at the 5' or 3' ends. No adjustments are made if the non-matched or non-aligned sequences are internal or under any other conditions.
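The manual truncation adjustment described above reduces to one subtraction, sketched here in Python. The function and argument names are hypothetical. Following the worked example (300 unmatched bases out of 1000), the sketch expresses the unmatched terminal residues as a percentage of the length of the extended cDNA, 5'EST, or consensus contigated 5' EST, and it applies only to 5' or 3' terminal truncations, not to internal gaps.

```python
def adjusted_homology(reference_length, unmatched_terminal, overlap_homology_pct):
    """Subtract the terminal-truncation penalty from a reported homology level.

    reference_length     -- length of the extended cDNA / 5' EST sequence
    unmatched_terminal   -- residues at the 5' or 3' end with no aligned partner
    overlap_homology_pct -- homology level reported for the overlapping region
    """
    if not 0 <= unmatched_terminal <= reference_length:
        raise ValueError("unmatched_terminal must lie between 0 and reference_length")
    penalty_pct = 100.0 * unmatched_terminal / reference_length
    return overlap_homology_pct - penalty_pct


# Worked example from the text: 1000-nucleotide reference, first 300 bases
# absent from the hybridizing sequence, overlapping 700 bases 100% identical.
print(adjusted_homology(1000, 300, 100.0))
```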

[0248] For example, using the above methods, nucleic acids having at least 95% nucleic acid homology, at least 96% nucleic acid homology, at least 97% nucleic acid homology, at least 98% nucleic acid homology, at least 99% nucleic acid homology, or more than 99% nucleic acid homology to the extended cDNA, 5'EST, or consensus contigated 5' EST from which the probe was derived may be obtained and identified. Such nucleic acids may be allelic variants or related nucleic acids from other species. Similarly, by using progressively less stringent hybridization conditions one can obtain and identify nucleic acids having at least 90%, at least 85%, at least 80% or at least 75% homology to the extended cDNA, 5'EST, or consensus contigated 5' EST from which the probe was derived.
[0249] Using the above methods and algorithms such as FASTA with parameters depending on the sequence length and degree of homology studied, for example the default parameters used by the algorithms in the absence of instructions from the user, one can obtain nucleic acids encoding proteins having at least 99%, at least 98%, at least 97%, at least 96%, at least 95%, [...] homology to the protein encoded by the extended cDNA, 5'EST, or consensus contigated 5' EST from which the probe was derived. In some embodiments, the homology levels can be determined using the "default" opening penalty and the "default" gap penalty, and a scoring matrix such as PAM 250 (a standard scoring matrix; see Dayhoff et al., in: Atlas of Protein Sequence and Structure, Vol. 5, Supp. 3 (1978)).
[0250] Alternatively, the level of polypeptide homology may be determined using the FASTDB algorithm described by Brutlag et al. Comp. App. Biosci. 6:237-245, 1990. In such analyses the parameters may be selected as follows: Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window Size=Sequence Length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the homologous sequence, whichever is shorter. If the homologous amino acid sequence is shorter than the amino acid sequence encoded by the extended cDNA, 5'EST, or consensus contigated 5' EST as a result of an N terminal and/or C terminal deletion the results may be manually corrected as follows. First, the number of amino acid residues of the amino acid sequence encoded by the extended cDNA, 5'EST, or consensus contigated 5' EST which are not matched or aligned with the homologous sequence is determined. Then, the percentage of the length of the sequence encoded by the extended cDNA, 5'EST, or consensus contigated 5' EST which the non-matched or non-aligned amino acids represent is calculated. This percentage is subtracted from the homology level. For example wherein the amino acid sequence encoded by the extended cDNA, 5'EST, or consensus contigated 5' EST is 100 amino acids in length and the length of the homologous sequence is 80 amino acids and wherein the amino acid sequence encoded by the extended cDNA or 5'EST is truncated at the N terminal end with respect to the homologous sequence, the homology level is calculated as follows. In the preceding scenario there are 20 non-matched, non-aligned amino acids in the sequence encoded by the extended cDNA, 5'EST, or consensus contigated 5' EST. This represents 20% of the length of the amino acid sequence encoded by the extended cDNA, 5'EST, or consensus contigated 5' EST. If the remaining amino acids are 100% identical between the two sequences, the homology level would be 100%-20%=80% homology. No adjustments are made if the non-matched or non-aligned sequences are internal or under any other conditions.

[0251] In addition to the above described methods, other protocols are available to obtain extended cDNAs using 5' ESTs or consensus contigated 5'ESTs as outlined in the following paragraphs.

[0252] Extended cDNAs may be prepared by obtaining mRNA from the tissue, cell, or organism of interest using mRNA preparation procedures utilizing poly A selection procedures or other techniques known to those skilled in the art. A first primer capable of hybridizing to the poly A tail of the mRNA is hybridized to the mRNA and a reverse transcription reaction is performed to generate a first cDNA strand.
[0253] The first cDNA strand is hybridized to a second primer containing at least 10 consecutive nucleotides of the sequences of SEQ ID NOs 24-4100 and 8178-36681. Preferably, the primer comprises at least 10, 12, 15, 17, 18, 20, 23, 25, or 28 consecutive nucleotides from the sequences of SEQ ID NOs 24-4100 and 8178-36681. In some embodiments, the primer comprises more than 30 nucleotides from the sequences of SEQ ID NOs 24-4100 and 8178-36681. If it is desired to obtain extended cDNAs containing the full protein coding sequence, including the authentic translation initiation site, the second primer used contains sequences located upstream of the translation initiation site. The second primer is extended to generate a second cDNA strand complementary to the first cDNA strand. Alternatively, RT-PCR may be performed as described above using primers from both ends of the cDNA to be obtained.
[0254] Extended cDNAs containing 5' fragments of the mRNA may be prepared by hybridizing an mRNA comprising the sequences of SEQ ID NOs: 8178-36681 with a primer comprising a sequence complementary to a fragment of an EST-related nucleic acid, hybridizing the primer to the mRNAs, and reverse transcribing the hybridized primer to make a first cDNA strand from the mRNAs. Preferably, the primer comprises at least 10, 12, 15, 17, 18, 20, 23, 25, or 28 consecutive nucleotides of the sequences complementary to SEQ ID NOs: 24-4100 and 8178-36681.



[0255] Thereafter, a second cDNA strand complementary to the first cDNA strand is synthesized. The second cDNA strand may be made by hybridizing a primer complementary to sequences in the first cDNA strand to the first cDNA strand and extending the primer to generate the second cDNA strand.

[0256] The double stranded extended cDNAs made using the methods described above are isolated and cloned. The extended cDNAs may be cloned into vectors such as plasmids or viral vectors capable of replicating in an appropriate host cell. For example, the host cell may be a bacterial, mammalian, avian, or insect cell.
[0257] Techniques for isolating mRNA, reverse transcribing a primer hybridized to mRNA to generate a first cDNA strand, extending a primer to make a second cDNA strand complementary to the first cDNA strand, isolating the double stranded cDNA and cloning the double stranded cDNA are well known to those skilled in the art and are described in Current Protocols in Molecular Biology, John Wiley & Sons, Inc. 1997 and Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989.
[0258] Alternatively, other procedures may be used for obtaining full-length cDNAs or extended cDNAs. In one approach, full-length or extended cDNAs are prepared from mRNA and cloned into double stranded phagemids as follows. The cDNA library in the double stranded phagemids is then rendered single stranded by treatment with an endonuclease, such as the Gene II product of the phage F1, and an exonuclease (Chang et al., Gene 127:95-8, 1993). A biotinylated oligonucleotide comprising the sequence of a fragment of an EST-related nucleic acid is hybridized to the single stranded phagemids. Preferably, the fragment comprises at least 10, 12, 15, 17, 18, 20, 23, 25, or 28 consecutive nucleotides of the sequences of SEQ ID NOs: 24-4100 and 8178-36681.

[0259] Hybrids between the biotinylated oligonucleotide and phagemids are isolated by incubating the hybrids with streptavidin coated paramagnetic beads and retrieving the beads with a magnet (Fry et al., Biotechniques, 13:124-131, 1992). Thereafter, the resulting phagemids are released from the beads and converted into double stranded DNA using a primer specific for the 5' EST or consensus contigated 5'EST sequence used to design the biotinylated oligonucleotide. Alternatively, protocols such as the Gene Trapper kit (Gibco BRL) may be used. The resulting double stranded DNA is transformed into bacteria. Extended cDNAs or full length cDNAs containing the 5' EST or consensus contigated 5'EST sequence are identified by colony PCR or colony hybridization.

[0260] Using any of the above described methods in section III, a plurality of extended cDNAs containing full-length protein coding sequences or portions of the protein coding sequences may be provided as cDNA libraries for subsequent evaluation of the encoded proteins or use in diagnostic assays as described below.
EXAMPLE 19

Full Length cDNAs

[0261] The procedures described in Example 18 were used to obtain 376 extended cDNAs or full length cDNAs derived from 5' ESTs in a variety of tissues. The following list provides a few examples of thus obtained cDNAs.

[0262] Using this procedure, the full length cDNA of SEQ ID NO: 1 (internal identification number 58-34-2-E7-FL2) was obtained. This cDNA encodes the signal peptide [...] (SEQ ID NO: 2) having a von Heijne score of 5.5.
[0263] Using this approach, the full length cDNA of SEQ ID NO: 3 (internal identification number 48-19-3-G1-FL1) was obtained. This cDNA encodes the signal peptide I^KVLUJTAILAVAVG (SEQ ID NO: 4) having a von Heijne score of 8.2.

[0264] The full length cDNA of SEQ ID NO: 5 (internal identification number 58-35-2-F10-FL2) was also obtained using this procedure. This cDNA encodes a signal peptide LVVLLFFLVTAIHA (SEQ ID NO: 6) having a von Heijne score of 10.7.

[0265] Furthermore, the polypeptides encoded by the extended or full-length cDNAs may be screened for the presence of known structural or functional motifs or for the presence of signatures, small amino acid sequences which are well conserved amongst the members of a protein family. The results obtained for the polypeptides encoded by a few full-length cDNAs derived from 5'ESTs that were screened for the presence of known protein signatures and motifs using the Proscan software from the GCG package and the Prosite 15.0 database are provided below.

[0266] The protein of SEQ ID NO: 8 encoded by the full-length cDNA SEQ ID NO: 7 (internal designation 78-8-3-E6-CL0_1C) and expressed in adult prostate belongs to the [...] family, from which it exhibits the characteristic PROSITE signature from positions 90 to 112. Proteins from this widespread family, from nematodes to fly, yeast, rodent and primate species, bind hydrophobic ligands such as phospholipids and nucleotides. They are mostly expressed in brain and in testis and are thought to play a role in cell growth and/or maturation, in sperm maturation, motility and in membrane remodeling. They may act either through signal transduction or through oxidoreduction reactions (for a review see Schoentgen and Jollès, FEBS Letters, 369:22-26 (1995)). Taken together, these data suggest that the protein of SEQ ID NO: 8 may play a role in cell growth, maturation and in membrane remodeling and/or may be related to male fertility. Thus, these proteins may be useful in diagnosing and/or treating cancer, neurodegenerative diseases, and/or disorders related to male fertility and sterility.
[0267] The protein of SEQ ID NO: 10 encoded by the full-length cDNA SEQ ID NO: 9 (internal designation 108-013-5-0-H9-FLC) shows homologies with a family of lysophospholipases conserved among eukaryotes (yeast, rabbit, rodents and human). In addition, some members of this family exhibit a calcium-independent phospholipase A2 activity (Portilla et al., J. Am. Soc. Nephrol., 9:1178-1186 (1998)). All members of this family exhibit the active site consensus GXSXG motif of carboxylesterases that is also found in the protein of SEQ ID NO: 10 (positions 54 to 58). In addition, this protein may be a membrane protein with one transmembrane domain as predicted by the software TopPred II (Claros and von Heijne, CABIOS applic. Notes, 10:685-686 (1994)). Taken together, these data suggest that the protein of SEQ ID NO: 10 may play a role in [...] metabolism, probably as a phospholipase. Thus, this protein or part thereof may be useful in diagnosing and/or treating several disorders including, but not limited to, cancer, diabetes, and neurodegenerative disorders such as Parkinson's and Alzheimer's diseases. It may also be useful in modulating inflammatory responses to infectious agents and/or to suppress graft rejection.
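Locating a short consensus motif such as GXSXG in a protein sequence can be sketched with a regular expression. This is only an illustration of the idea, not the Proscan software or the PROSITE pattern syntax. The function name is hypothetical, the example sequences are made up, and "X" is assumed to mean any single residue.

```python
import re


def find_motif(protein, pattern="G.S.G"):
    """Return 1-based (start, end) positions of every match of a motif.

    The pattern is a Python regular expression in which "." stands for any
    residue; a lookahead is used so that overlapping matches are reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [
        (match.start() + 1, match.start() + len(match.group(1)))
        for match in regex.finditer(protein.upper())
    ]


# Made-up sequences for illustration only.
print(find_motif("MAAGHSLGGA"))
print(find_motif("GSSSGSSSG"))
```

Positions are reported 1-based and inclusive, which matches the way the text quotes motif coordinates (for example, positions 54 to 58).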
[0268] The protein of SEQ ID NO: 12 encoded by the full-length cDNA SEQ ID NO: 11 (internal designation 108-004-5-0-D10-FLC) shows remote homology to a subfamily of beta4-galactosyltransferases widely conserved in animals (human, rodents, cow and chicken). Such enzymes, usually type II membrane proteins located in the endoplasmic reticulum or in the Golgi apparatus, catalyze the biosynthesis of glycoproteins, glycolipid glycans and lactose. Their characteristic features, defined as those of subfamily A in Breton et al., J. Biochem., 123:1000-1009 (1998), are pretty well conserved in the protein of SEQ ID NO: 12, especially the region I containing the [...] motif (positions 163-165) thought to be involved either in UDP binding or in the catalytic process itself. In addition, the protein of SEQ ID NO: 12 has the typical structure of a type II protein. Indeed, it contains a short 28-amino-acid-long N-terminal tail, a transmembrane segment from positions 29 to 49 and a large 278-amino-acid-long C-terminal tail as predicted by the software TopPred II (Claros and von Heijne, CABIOS applic. Notes, 10:685-686 (1994)). Taken together, these data suggest that the protein of SEQ ID NO: 12 may play a role in the biosynthesis of polysaccharides, and of the carbohydrate moieties of glycoproteins and glycolipids and/or in cell-cell recognition. Thus, this protein may be useful in diagnosing and/or treating several types of disorders including, but not limited to, cancer, atherosclerosis, cardiovascular disorders, autoimmune disorders and rheumatic diseases including rheumatoid arthritis.
[0269] The protein of SEQ ID NO: 14 encoded by the full-length cDNA SEQ ID NO: 13 (internal designation 108-009-5-0-A2-FLC) shows extensive homology to the bZIP family of transcription factors, and especially to the human luman protein (Lu et al., Mol. Cell. Biol., 17:5117-5126 (1997)). The match includes the whole bZIP domain composed of a basic DNA-binding domain and of a leucine zipper allowing protein dimerization. The basic domain is conserved in the protein of SEQ ID NO: 14 as shown by the characteristic PROSITE signature (positions 224-237) except for a conservative substitution of a glutamic acid with an aspartic acid in position 233. The typical PROSITE signature for leucine zipper is also present (positions 259 to 280). Taken together, these data suggest that the protein of SEQ ID NO: 14 may bind to DNA, hence regulating gene expression as a transcription factor. Thus, this protein may be useful in diagnosing and/or treating several types of disorders including, but not limited to, cancer.
[0270] Bacterial clones containing plasmids containing the full length cDNAs described above are presently stored in the inventor's laboratories under the internal identification numbers provided above. The inserts may be recovered from the deposited materials by growing an aliquot of the appropriate bacterial clone in the appropriate medium. The plasmid DNA can then be isolated using plasmid isolation procedures familiar to those skilled in the art such as alkaline lysis minipreps or large scale alkaline lysis plasmid isolation procedures. If desired the plasmid DNA may be further enriched by centrifugation on a cesium chloride gradient, size exclusion chromatography, or anion exchange chromatography. The plasmid DNA obtained using these procedures may then be manipulated using standard cloning techniques familiar to those skilled in the art. Alternatively, a PCR can be done with primers designed at both ends of the EST insertion. The PCR product which corresponds to the 5'EST can then be manipulated using standard cloning techniques familiar to those skilled in the art.
[0271] EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids, and fragments of positional segments of EST-related nucleic acids may be used to express the polypeptides which they encode. In particular, they may be used to express EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides. In some embodiments, the EST-related nucleic acids, positional segments of EST-related nucleic acids, and fragments of positional segments of EST-related nucleic acids may be used to express the full polypeptide (i.e. the signal peptide and the mature polypeptide) of a secreted protein, the mature protein (i.e. the polypeptide generated after cleavage of the signal peptide), or the signal peptide of a secreted protein. If desired, nucleic acids encoding the signal peptide may be used to facilitate secretion of the expressed protein. It will be appreciated that a plurality of EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids, or fragments of positional segments of EST-related nucleic acids may be simultaneously cloned into expression vectors to create an expression library for analysis of the encoded proteins as described below.

EXAMPLE 20 

Expression of the Proteins Encoded by the Genes Corresponding to the 5'ESTs or Consensus Contigated 5' ESTs

[0272] To express their encoded proteins, the EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids, or fragments of positional segments of EST-related nucleic acids are cloned into a suitable expression vector. In some instances, nucleic acids encoding EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides may be cloned into a suitable expression vector.
[0273] In some embodiments, the nucleic acids inserted into the
expression vector may comprise the coding sequence of a sequence
selected from the group consisting of SEQ ID NOs: 24-4100. In other
embodiments, the nucleic acids inserted into the expression vector may
comprise the full coding sequence (i.e., the nucleotides encoding the
signal peptide and the mature polypeptide) of one of SEQ ID NOs:
3721-3811. In some embodiments, the nucleic acid inserted into the
expression vector may comprise the nucleotides of one of the sequences
of SEQ ID NOs: 3721-3811 which encode the mature polypeptide (i.e., the
nucleotides encoding the polypeptide generated after cleavage of the
signal peptide). In further embodiments, the nucleic acids inserted
into the expression vector may comprise the nucleotides of SEQ ID NOs:
24-652 and 3721-3811 which encode the signal peptide to facilitate
secretion of the expressed protein. The nucleic acids inserted into the
expression vectors may also contain sequences upstream of the sequences
encoding the signal peptide, such as sequences which regulate
expression levels or sequences which confer tissue-specific expression.
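
As a rough illustration of the alternatives listed in paragraph [0273], the sketch below splits a coding sequence into the nucleotides encoding the signal peptide and the nucleotides encoding the mature polypeptide. The function name, the toy sequence, and the assumed signal peptide length of five residues are hypothetical; none of them is taken from a SEQ ID NO of this document.

```python
def split_coding_sequence(cds, signal_peptide_residues):
    """Split a coding sequence at an assumed signal peptide cleavage site.

    cds: nucleotide string starting at the initiation codon (hypothetical input).
    signal_peptide_residues: assumed number of amino acids in the signal peptide.
    Returns (signal_peptide_nucleotides, mature_polypeptide_nucleotides).
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    cut = 3 * signal_peptide_residues  # three nucleotides per residue
    if not 0 < cut < len(cds):
        raise ValueError("cleavage site must fall inside the coding sequence")
    return cds[:cut], cds[cut:]


# Toy example only: 9 codons, with the first 5 treated as the signal peptide.
toy_cds = "ATGAAACTGCTGGCTGCTGGTTCTTAA"
signal_nt, mature_nt = split_coding_sequence(toy_cds, 5)
```

Either part, or the full sequence, could then stand for the insert chosen in a given embodiment.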

[0274] The nucleic acid inserted into the expression vector may encode
a polypeptide comprising one of the sequences of SEQ ID NOs:
[illegible]. In some embodiments, the nucleic acid inserted into the
expression vector may encode the full polypeptide sequence (i.e., the
signal peptide and the mature polypeptide) included in one of SEQ ID
NOs: 7798-7888. In other embodiments, the nucleic acid inserted into
the expression vector may encode the mature polypeptide (i.e., the
polypeptide generated after cleavage of the signal peptide) included in
one of the sequences of SEQ ID NOs: 7798-7888. In further embodiments,
the nucleic acids inserted into the expression vector may encode the
signal
peptide included in one of the sequences of SEQ ID NOs: 4101-4729 and
7798-7888.

[0275] The nucleic acid encoding the protein or polypeptide to be
expressed is operably linked to a promoter in an expression vector
using conventional cloning technology. The expression vector may be any
of the mammalian, yeast, insect or bacterial expression systems known
in the art. Commercially available vectors and expression systems are
available from a variety of suppliers including Genetics Institute
(Cambridge, MA), Stratagene (La Jolla, California), Promega (Madison,
Wisconsin), and [illegible]. If desired, to enhance expression and
facilitate proper protein folding, the codon context and codon pairing
of the sequence may be optimized for the particular expression organism
in which the expression vector is introduced, as explained by Hatfield,
et al., U.S. Patent No. 5,082,767.

[0276] The following is provided as one exemplary method to express
the proteins encoded by the nucleic acids described above. In some
instances the nucleic acid encoding the protein or polypeptide to be
expressed includes a methionine initiation codon and a polyA signal. If
the nucleic acid encoding the polypeptide to be expressed lacks a
methionine to serve as the initiation site, an initiating methionine
can be introduced next to the first codon of the nucleic acid using
conventional techniques. Similarly, if the nucleic acid encoding the
protein or polypeptide to be expressed lacks a polyA signal, this
sequence can be added to the construct by, for example, splicing out
the polyA signal from pSG5 (Stratagene) using [illegible] restriction
endonuclease enzymes and incorporating it into the mammalian expression
vector pXT1 (Stratagene). pXT1 contains the LTRs and a portion of the
gag gene from Moloney Murine Leukemia Virus. The position of the LTRs
in the construct allows efficient stable transfection. The vector
includes the Herpes Simplex thymidine kinase promoter and the
selectable neomycin gene. The nucleic acid encoding the polypeptide to
be expressed is obtained by PCR from the [illegible] using primers
complementary to the nucleic acid encoding the protein or polypeptide
to be expressed and containing restriction endonuclease sequences for
Pst I incorporated into the 5' primer and BglII at the 5' end of the 3'
primer, taking care to ensure that the nucleic acid encoding the
protein or polypeptide to be expressed is properly positioned with
respect to the poly A signal. The purified fragment obtained from the
resulting PCR reaction is digested with PstI, blunt ended with an
exonuclease, digested with BglII, purified and ligated to pXT1, now
containing a poly A signal and digested with BglII.

[0277] The ligated product is transfected into mouse NIH 3T3 cells
using Lipofectin (Life Technologies, Inc., Grand Island, New York)
under conditions outlined in the product specification. Positive
transfectants are selected after growing the transfected cells in
[illegible] (Sigma, St. Louis, Missouri).

[0278] Alternatively, the nucleic acid encoding the protein or
polypeptide to be expressed may be cloned into pED6dpc2 as described
above. The resulting pED6dpc2 constructs may be transfected into a
suitable host cell, such as COS 1 cells. Methotrexate resistant cells
are selected and expanded. The expressed protein or polypeptide may be
isolated, purified, or enriched as described above.

[0279] To confirm expression of the desired protein or polypeptide, the
proteins expressed from host cells containing a vector with a nucleic
acid insert encoding the protein or polypeptide are compared to those
from host cells containing the expression vector lacking such an
insert. The expressed proteins are detected using techniques familiar
to those skilled in the art such as Coomassie blue or silver staining
or using antibodies against the protein or polypeptide encoded by the
nucleic acid insert. Antibodies capable of specifically recognizing the
protein of interest may be generated using synthetic 15-mer peptides
having a sequence encoded by the appropriate nucleic acid. The
synthetic peptides [illegible].

[0280] If the proteins or polypeptides encoded by the nucleic acid
inserts are secreted, medium prepared from the host cells or organisms
containing an expression vector which contains a nucleic acid insert
encoding the desired protein or polypeptide is compared to medium
prepared from the control cells or organism. The presence of a band in
the medium from the cells containing the nucleic acid insert which is
absent from preparations from the control cells indicates that the
protein or polypeptide encoded by the nucleic acid insert is being
expressed and secreted. Generally, the band corresponding to the
protein encoded by the nucleic acid insert will have a mobility near
that expected based on the number of amino acids in the open reading
frame of the nucleic acid insert. However, the band may have a mobility
different than that expected as a result of modifications such as
glycosylation, ubiquitination, or enzymatic cleavage.

[0281] Alternatively, if the protein expressed from the above
expression vectors does not contain sequences directing its secretion,
the proteins expressed from host cells containing an expression vector
with an insert encoding a secreted protein or portion thereof can be
compared to the proteins expressed in host cells containing the
expression vector without an insert. The presence of a band in samples
from cells containing the expression vector with an insert which is
absent in samples from cells containing the expression vector without
an insert indicates that the desired protein or portion thereof is
being expressed. Generally, the band will have the mobility expected
for the secreted protein or portion thereof. However, the band may have
a mobility different than that expected as a result of modifications
such as glycosylation, ubiquitination, or enzymatic cleavage.

[0282] The expressed protein or polypeptide may be purified, isolated
or enriched using a variety of methods. In some methods, the protein or
polypeptide may be secreted into the culture medium via a native signal
peptide or a heterologous signal peptide operably linked thereto. In
some methods, the protein or polypeptide may be linked to a
heterologous polypeptide which facilitates its isolation, purification,
or enrichment, such as a nickel binding polypeptide. The protein or
polypeptide may also be obtained by gel electrophoresis, ion exchange
chromatography, size chromatography, HPLC, salt precipitation,
[illegible] any of the preceding methods, or any of the isolation,
purification, or enrichment techniques familiar to those skilled in the
art.
[0283] The protein encoded by the nucleic acid insert may also be
purified using standard immunochromatography techniques using
immunoaffinity chromatography with antibodies directed against the
protein or polypeptide as described in more detail below. If antibody
production is not possible, the nucleic acid insert encoding the
desired protein or polypeptide may be incorporated into expression
vectors designed for use in purification schemes employing chimeric
polypeptides. In such strategies, the coding sequence of the nucleic
acid insert is ligated in frame with the gene encoding the other half
of the chimera. The other half of the chimera may be β-globin or a
nickel binding polypeptide. A chromatography matrix having antibody to
β-globin or nickel attached thereto is then used to purify the chimeric
protein. Protease cleavage sites may be engineered between the β-globin
gene or the nickel binding polypeptide and the extended cDNA or portion
thereof. Thus, the two polypeptides of the chimera may be separated
from one another by protease digestion.

[0284] One useful expression vector for generating β-globin chimerics
is pSG5 (Stratagene), which encodes rabbit β-globin. [illegible]
facilitates splicing of the expressed transcript, and the
polyadenylation signal incorporated into the construct increases the
level of expression. These techniques as described are well known to
those skilled in the art of molecular biology. Standard methods are
published in methods texts such as Davis et al. (Basic Methods in
Molecular Biology, L.G. Davis, M.D. Dibner, and J.F. Battey, ed.,
Elsevier Press, NY, 1986) and many of the methods are available from
Stratagene, Life Technologies, Inc., or Promega. Polypeptide may
additionally be produced from the construct using in vitro translation
systems such as the In vitro Express™ Translation Kit (Stratagene).

[0285] Following expression and purification of the proteins or
polypeptides encoded by the nucleic acid inserts, the purified proteins
may be tested for the ability to bind to the surface of various cell
types as described in Example 21 below. It will be appreciated that a
plurality of proteins expressed from these nucleic acid inserts may be
included in a panel of proteins to be simultaneously evaluated for the
activities specifically described below, as well as other biological
roles for which assays for determining activity are available.

EXAMPLE 21

Analysis of Secreted Proteins to Determine Whether
They Bind to the Cell Surface

[0286] The EST-related nucleic acids, fragments of EST-related nucleic
acids, positional segments of EST-related nucleic acids, fragments of
positional segments of EST-related nucleic acids, nucleic acids
encoding the EST-related polypeptides, nucleic acids encoding fragments
of the EST-related polypeptides, nucleic acids encoding positional
segments of EST-related polypeptides, or nucleic acids encoding
fragments of positional segments of EST-related polypeptides are cloned
into expression vectors such as those described in Example 20. The
encoded proteins or polypeptides are purified, isolated, or enriched as
described above. Following purification, isolation, or enrichment, the
proteins or polypeptides are labeled using techniques known to those
skilled in the art. The labeled proteins or polypeptides are incubated
with cells or cell lines derived from a variety of organs or tissues to
allow the proteins to bind to any receptor present on the cell surface.
Following the incubation, the cells are washed to remove
non-specifically bound proteins or polypeptides. The specifically bound
labeled proteins or polypeptides are detected by autoradiography.
Alternatively, unlabeled proteins or polypeptides may be incubated with
the cells and detected with antibodies having a detectable label, such
as a fluorescent molecule, attached thereto.
[0287] Specificity of cell surface binding may be analyzed by
conducting a competition analysis in which various amounts of unlabeled
protein or polypeptide are incubated along with the labeled protein or
polypeptide. The amount of labeled protein or polypeptide bound to the
cell surface decreases as the amount of competitive unlabeled protein
or polypeptide increases. As a control, various amounts of an unlabeled
protein or polypeptide unrelated to the labeled protein or polypeptide
are included in some binding reactions. The amount of labeled protein
or polypeptide bound to the cell surface does not decrease in binding
reactions containing increasing amounts of unrelated unlabeled protein,
indicating that the protein or polypeptide encoded by the nucleic acid
binds specifically to the cell surface.
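
The pattern expected from the competition analysis in paragraph [0287] can be sketched with a simple one-site equilibrium binding model. This is an illustrative assumption rather than a method described in this document: the function names, the dissociation constants, and the concentrations are hypothetical, and free ligand is assumed to equal total ligand.

```python
def fraction_bound(labeled_nM, kd_nM, competitor_nM=0.0, ki_nM=float("inf")):
    """Fraction of receptor occupied by labeled protein (one-site model).

    A competitor that binds the same site raises the apparent Kd of the
    labeled protein by the factor (1 + [competitor] / Ki). An unrelated
    protein is modeled with Ki = infinity, so it has no effect.
    """
    apparent_kd = kd_nM * (1.0 + competitor_nM / ki_nM)
    return labeled_nM / (labeled_nM + apparent_kd)


def competition_curve(labeled_nM, kd_nM, ki_nM, competitor_doses_nM):
    """Occupancy by labeled protein at each dose of unlabeled protein."""
    return [fraction_bound(labeled_nM, kd_nM, dose, ki_nM)
            for dose in competitor_doses_nM]


# Hypothetical doses (nM) of unlabeled protein added to each binding reaction.
doses = [0.0, 1.0, 10.0, 100.0, 1000.0]

# Unlabeled copy of the same protein: competes with Ki equal to Kd.
specific = competition_curve(1.0, 1.0, 1.0, doses)

# Unrelated unlabeled protein: does not bind the receptor.
unrelated = competition_curve(1.0, 1.0, float("inf"), doses)
```

In this sketch `specific` falls as the dose of unlabeled protein rises, while `unrelated` stays constant, which is the pattern the paragraph describes for specific binding.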

[0288] As discussed above, human proteins have been shown to have a
number of important physiological effects and, consequently, represent
a valuable therapeutic resource. The human proteins or polypeptides
made as described above may be evaluated to determine their
physiological activities as described below.

EXAMPLE 22

Assaying the Expressed Proteins or Polypeptides for
Cytokine, Cell Proliferation or Cell Differentiation
Activity

[0289] As discussed above, some human proteins act as cytokines or may
affect cellular proliferation or differentiation. Many protein factors
discovered to date, including all known cytokines, have exhibited
activity in one or more factor dependent cell proliferation assays, and
hence the assays serve as a convenient confirmation of cytokine
activity. The activity of a protein or polypeptide of the present
invention is evidenced by any one of a number of routine factor
dependent cell proliferation assays for cell lines including, without
limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+ (preB M+),
2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e and CMK. The proteins
or polypeptides prepared as described above may be evaluated for their
ability to regulate T cell or thymocyte proliferation in assays such as
those described above or in the following references: Current Protocols
in Immunology, Ed. by J.E. Coligan et al., Greene Publishing Associates
and Wiley-Interscience; Takai et al., J. Immunol. 137:3494-3500, 1986;
Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et
al., Cellular Immunology 133:327-341, 1991; Bertagnolli et al., J.
Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761,
1994.

[0290] In addition, numerous assays for cytokine production and/or
proliferation of spleen cells, lymph node cells and thymocytes are
known. These include the techniques disclosed in Current Protocols in
Immunology, J.E. Coligan et al. Eds., 1:3.12.1-3.12.14, John Wiley and
Sons, [illegible]; and Schreiber, R.D. In Current Protocols in
Immunology, supra 1:6.8.1-6.8.8.

[0291] The proteins or polypeptides prepared as described above may
also be assayed for the ability to regulate the proliferation and
differentiation of hematopoietic or lymphopoietic cells. Many assays
for such activity are familiar to those skilled in the art, including
the assays in the following references: Bottomly et al., In Current
Protocols in Immunology, supra 1:6.3.1-6.3.12; deVries et al., J. Exp.
Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983;
Nordan, R., In Current Protocols in Immunology, supra 1:6.6.1-6.6.5;
Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Bennett
et al., in Current Protocols in Immunology, supra 1:6.15.1; Ciarletta
et al., In Current Protocols in Immunology, supra 1:6.13.1.
[0292] The proteins or polypeptides prepared as described above may
also be assayed for their ability to regulate T-cell responses to
antigens. Many assays for such activity are familiar to those skilled
in the art, including the assays described in the following references:
Chapter 3 (In vitro Assays for Mouse Lymphocyte Function), Chapter 6
(Cytokines and Their Cellular Receptors) and Chapter 7 (Immunologic
Studies in Humans) in Current Protocols in Immunology, supra;
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980;
Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J.
Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988.
[0293] Those proteins or polypeptides which exhibit cytokine, cell
proliferation, or cell differentiation activity may then be formulated
as pharmaceuticals and used to treat clinical conditions in which
induction of cell proliferation or differentiation is beneficial.
Alternatively, as described in more detail below, nucleic acids
encoding these proteins or polypeptides or nucleic acids regulating the
expression of these proteins or polypeptides may be introduced into
appropriate host cells to increase or decrease the expression of the
proteins or polypeptides as desired.

EXAMPLE 23

Assaying the Expressed Proteins or Polypeptides for
Activity as Immune System Regulators

[0294] The proteins or polypeptides prepared as described above may
also be evaluated for their effects as immune regulators. For example,
the proteins or polypeptides may be evaluated for their activity to
influence thymocyte [illegible]. Numerous assays for such activity are
familiar to those skilled in the art, including the assays described in
the following references: Chapter 3 (In vitro Assays for Mouse
Lymphocyte Function 3.1-3.19) and Chapter 7 (Immunologic Studies in
Humans) in Current Protocols in Immunology, Coligan et al. Eds., Greene
Publishing Associates and Wiley-Interscience; Herrmann et al., Proc.
Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol.
128:1968-1974, 1982; Handa et al., J. Immunol. 135:1564-1572, 1985;
Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998;
Bertagnolli et al., Cell. Immunol. 133:327-341, 1991; Brown et al., J.
Immunol. 153:3079-3092, 1994.

[0295] The proteins or polypeptides prepared as described above may
also be evaluated for their effects on T-cell dependent immunoglobulin
responses and isotype switching. Numerous assays for such activity are
familiar to those skilled in the art, including the assays disclosed in
the following references: Maliszewski, J. Immunol. 144:3028-3033, 1990;
and [illegible] in Current Protocols in Immunology, 1:3.8.1-3.8.16,
supra.
[0296] The proteins or polypeptides prepared as described above may
also be evaluated for their effect on immune effector cells, including
their effect on Th1 cells and cytotoxic lymphocytes. Numerous assays
for such activity are familiar to those skilled in the art, including
the assays disclosed in the following references: Chapter 3 (In vitro
Assays for Mouse Lymphocyte Function 3.1-3.19) and Chapter 7
(Immunologic Studies in Humans) in Current Protocols in Immunology,
supra; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol.
149:3778-3783, 1992.
[0297] The proteins or polypeptides prepared as described above may
also be evaluated for their effect on dendritic cell mediated
activation of naive T-cells. Numerous assays for such activity are
familiar to those skilled in the art, including the assays disclosed in
the following references: Guery et al., J. Immunol. 134:536-544, 1995;
Inaba et al., J. Exp. Med. 173:549-559, 1991; Macatonia et al., J.
Immunol. 154:5071-5079, 1995; Porgador et al., J. Exp. Med.
182:255-260, 1995; Nair et al., J. Virol. 67:4062-4069, 1993; Huang et
al., Science 264:961-965, 1994; Macatonia et al., J. Exp. Med.
169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., J. Exp. Med. 172:631-640, 1990.

[0298] The proteins or polypeptides prepared as described above may
also be evaluated for their influence on the lifetime of lymphocytes.
Numerous assays for such activity are familiar to those skilled in the
art, including the assays disclosed in the following references:
Darzynkiewicz et al., Cytometry 13:795-808, 1992; [illegible]; Itoh et
al., Cell 66:233-243, 1991; Zacharchuk, J. Immunol. 145:4037-4045,
1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., Int.
J. Oncol. 1:639-648, 1992.
[0299] The proteins or polypeptides prepared as described above may
also be evaluated for their influence on early steps of T-cell
commitment and development. Numerous assays for such activity are
familiar to those skilled in the art, including without limitation the
assays disclosed in the following references: Antica et al., Blood
84:111-117, 1994; Fine et al., Cell. Immunol. 155:111-122, 1994;
[illegible], 1995; Toki et al., Proc. Nat. Acad. Sci. USA 88:7548-7551,
1991.

[0300] Those proteins or polypeptides which exhibit activity as immune
system regulators may then be formulated as pharmaceuticals and used to
treat clinical conditions in which regulation of immune activity is
beneficial. For example, the protein or polypeptide may be useful in
the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency), e.g., in regulating (up or down)
growth and proliferation of [illegible], as well as effecting the
cytolytic activity of NK cells and other cell populations. These immune
deficiencies may be genetic or be caused by viral (e.g., HIV) as well
as bacterial or fungal infections, or may result from autoimmune
disorders. More specifically, infectious diseases caused by viral,
bacterial, fungal or other infection may be treatable using the protein
or polypeptide, including infections by HIV, hepatitis viruses,
herpesviruses, mycobacteria, Leishmania spp., plasmodium, and various
fungal infections such as candidiasis. Of course, in this regard, a
protein or polypeptide may also be useful where a boost to the immune
system generally may be desirable, i.e., in the treatment of cancer.

[0301] Alternatively, the proteins or polypeptides prepared as
described above may be used in treatment of autoimmune disorders
including, for example, connective tissue disease, multiple sclerosis,
systemic lupus erythematosus, rheumatoid arthritis, autoimmune
pulmonary inflammation, Guillain-Barre syndrome, autoimmune
thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis,
graft-versus-host disease and autoimmune inflammatory eye disease. Such
a protein or polypeptide may also be useful in the treatment of
allergic reactions and conditions, such as asthma (particularly
allergic asthma) or other respiratory problems. Other conditions in
which immune suppression is desired (including, for example, organ
transplantation) may also be treatable using the protein or
polypeptide.
[0302] Using the proteins or polypeptides of the invention it may also
be possible to regulate immune responses either up or down. Down
regulation may involve inhibiting or blocking an immune response
already in progress or may involve preventing the induction of an
immune response. The functions of activated T-cells may be inhibited by
suppressing T cell responses or by inducing specific tolerance in T
cells, or both. Immunosuppression of T cell responses is generally an
active non-antigen-specific process which requires continuous exposure
of the T cells to the suppressive agent. Tolerance, which involves
inducing non-responsiveness or anergy in T cells, is distinguishable
from immunosuppression in that it is generally antigen-specific and
persists after the end of exposure to the tolerizing agent.
Operationally, tolerance can be demonstrated by the lack of a T cell
response upon reexposure to specific antigen in the absence of the
tolerizing agent.
[0303] Down regulating or preventing one or more antigen functions
(including without limitation B lymphocyte antigen functions, such as,
for example, B7 costimulation), e.g., preventing high level lymphokine
synthesis by activated T cells, will be useful in situations of tissue,
skin and organ transplantation and in graft-versus-host disease (GVHD).
For example, blockage of T cell function should result in reduced
tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its
recognition as foreign by T cells, followed by an immune reaction that
destroys the transplant. The administration of a molecule which
inhibits or blocks interaction of a B7 lymphocyte antigen with its
natural ligand(s) on immune cells (such as a soluble, monomeric form of
a peptide having B7-2 activity alone or in conjunction with a monomeric
form of a peptide having an activity of another B lymphocyte antigen
(e.g., B7-1, B7-3) or blocking antibody), prior to transplantation, can
lead to the binding of the molecule to the natural ligand(s) on the
immune cells without transmitting the corresponding costimulatory
signal. Blocking B lymphocyte antigen function in this manner prevents
cytokine synthesis by immune cells, such as T cells, and thus acts as
an immunosuppressant. Moreover, the lack of costimulation may also be
sufficient to anergize the T cells, thereby inducing tolerance in a
subject. Induction of long-term tolerance by B lymphocyte
antigen-blocking reagents may avoid the necessity of repeated
administration of these blocking reagents. To achieve sufficient
immunosuppression or tolerance in a subject, it may also be necessary
to block the function of a combination of B lymphocyte antigens.
[0304] The efficacy of particular blocking reagents in preventing organ
transplant rejection or GVHD can be assessed using animal models that
are predictive of efficacy in humans. Examples of appropriate systems
which can be used include allogeneic cardiac grafts in rats and
xenogeneic pancreatic islet cell grafts in mice, both of which have
been used to examine the immunosuppressive effects of CTLA4Ig fusion
proteins in vivo as described in Lenschow et al., Science 257:789-792
(1992) and Turka et al., Proc. Natl. Acad. Sci. USA 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental
Immunology, Raven Press, New York, 1989, pp. 846-847) can be used to
determine the effect of blocking B lymphocyte antigen function in vivo
on the development of that disease.
[0305] Blocking antigen function may also be therapeutically useful for
treating autoimmune diseases. Many autoimmune disorders are the result
of inappropriate activation of T cells that are reactive against self
tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of
autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents which block costimulation of T cells by
disrupting receptor/ligand interactions of B lymphocyte antigens can be
used to inhibit T cell activation and prevent production of
autoantibodies or T cell-derived cytokines which may be involved in the
disease process. Additionally, blocking reagents may induce
antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in
preventing or alleviating autoimmune disorders can be determined using
a number of well-characterized animal models of human autoimmune
diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice,
murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and
BB rats, and murine experimental myasthenia gravis (see Paul ed.,
Fundamental Immunology, Raven Press, New York, 1989, pp. 840-856).
[0306] Upregulation of an antigen function (preferably a B lymphocyte
antigen function), as a means of up regulating immune responses, may
also be useful in therapy. Upregulation of immune responses may involve
either enhancing an existing immune response or eliciting an initial
immune response as shown by the following examples. For instance,
enhancing an immune response through stimulating B lymphocyte antigen
function may be useful in cases of viral infection. In addition,
systemic viral diseases such as influenza, the common cold, and
encephalitis might be alleviated by the administration of a stimulatory
form of B lymphocyte antigens systemically.

[0307] Alternatively, antiviral immune responses may be enhanced in an
infected patient by removing T cells from the patient, costimulating
the T cells in vitro with viral antigen-pulsed APCs either expressing
the proteins or polypeptides described above or together with a
stimulatory form of the protein or polypeptide and reintroducing the in
vitro primed T cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to T cells in vivo,
thereby activating the T cells.
[0308] In another application, upregulation or enhancement of antigen
function (preferably B lymphocyte antigen function) may be useful in
the induction of tumor immunity. Tumor cells (e.g., sarcoma, melanoma,
lymphoma, leukemia, neuroblastoma, carcinoma) transfected with one of
the above-described nucleic acids encoding a protein or polypeptide can
be administered to a subject to overcome tumor-specific tolerance in
the subject. If desired, the tumor cell can be transfected to express a
combination of peptides. For example, tumor cells obtained from a
patient can be transfected ex vivo with an expression vector directing
the expression of a peptide having B7-2-like activity alone, or in
conjunction with a peptide having B7-1-like activity and/or B7-3-like
activity. The transfected tumor cells are returned to the patient to
result in expression of the peptides on the surface of the transfected
cell. Alternatively, gene therapy techniques can be used to target a
tumor cell for transfection in vivo.

[0309] The presence of the protein or polypeptide encoded by the
nucleic acids described above having the activity of a B lymphocyte
antigen(s) on the surface of the tumor cell provides the necessary
costimulation signal to T cells to induce a T cell mediated immune
response against the transfected tumor cells. In addition, tumor cells
which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules,
can be transfected with nucleic acids encoding all or a portion of
(e.g., a cytoplasmic-domain truncated portion) an MHC class I α chain
and β2 microglobulin or an MHC class II α chain and an MHC class II β
chain to thereby express MHC class I or MHC class II proteins on the
cell surface, respectively. Expression of the appropriate MHC class I
or class II molecules in conjunction with a peptide having the activity
of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T cell
mediated immune response against the transfected tumor cell.
Optionally, a nucleic acid encoding an antisense construct which blocks
expression of an MHC class II associated protein, such as the invariant
chain, can also be cotransfected with a DNA encoding a protein or
polypeptide having the activity of a B lymphocyte antigen to promote
presentation of tumor associated antigens and induce tumor specific
immunity. Thus, the induction of a T cell mediated immune response in a
human subject may be sufficient to overcome tumor-specific tolerance in
the subject. Alternatively, as described in more detail below, nucleic
acids encoding these immune system regulator proteins or polypeptides
or nucleic acids regulating the expression of such proteins or
polypeptides may be introduced into appropriate host cells to increase
or decrease the expression of the proteins as desired.

EXAMPLE 24

Assaying the Expressed Proteins or Polypeptides for
Hematopoiesis Regulating Activity

[0310] The proteins or polypeptides encoded by the nucleic acids
described above may also be evaluated for their hematopoiesis
regulating activity. For example, the effect of the proteins or
polypeptides on embryonic stem cell differentiation may be evaluated.
Numerous assays for such activity are familiar to those skilled in the
art, including the assays disclosed in the following references:
Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al.,
Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al.,
Blood 81:2903-2915, 1993.
[0311] The proteins or polypeptides encoded by the nucleic acids
described above may also be evaluated for their influence on the
lifetime of stem cells and stem cell differentiation. Numerous assays
for such activity are familiar to those skilled in the art, including
the assays disclosed in the following references: Freshney, M.G.
Methylcellulose Colony Forming Assays, in Culture of Hematopoietic
Cells. R.I. Freshney, et al. Eds. pp. 265-268, Wiley-Liss, Inc., New
York, NY. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA
89:5907-5911, 1992; McNiece, I.K. and Briddell, R.A. Primitive
Hematopoietic Colony Forming Cells with High Proliferative Potential,
in Culture of Hematopoietic Cells. R.I. Freshney, et al. Eds. pp.
23-39, Wiley-Liss, Inc., New York, NY. 1994; Neben et al., Experimental
Hematology 22:353-359, 1994; Ploemacher, R.E. Cobblestone Area Forming
Cell Assay, in Culture of Hematopoietic Cells. R.I. Freshney, et al.
Eds. pp. 1-21, Wiley-Liss, Inc., New York, NY. 1994; Spooncer, E.,
Dexter, M. and Allen, T. Long Term Bone Marrow Cultures in the Presence
of Stromal Cells, in Culture of Hematopoietic Cells. R.I. Freshney, et
al. Eds. pp. 163-179, Wiley-Liss, Inc., New York, NY. 1994; and
Sutherland, H.J. Long Term Culture Initiating Cell Assay, in Culture of
Hematopoietic Cells. R.I. Freshney, et al. Eds. pp. 139-162,
Wiley-Liss, Inc., New York, NY. 1994.

[0312] Those proteins or polypeptides which exhibit
hematopoiesis regulatory activity may then be formulated
as pharmaceuticals and used to treat clinical conditions
in which regulation of hematopoiesis is beneficial.
For example, a protein or polypeptide of the present
invention may be useful in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or
lymphoid cell deficiencies. Even marginal biological activity
in support of colony forming cells or of factor-dependent
cell lines indicates involvement in regulating
hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with
other cytokines, thereby indicating utility, for example,
in treating various anemias or for use in conjunction with
irradiation/chemotherapy to stimulate the production of
erythroid precursors and/or erythroid cells; in supporting
the growth and proliferation of myeloid cells such as
granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction
with chemotherapy to prevent or treat consequent
myelosuppression; in supporting the growth and
proliferation of megakaryocytes and consequently of platelets,
thereby allowing prevention or treatment of various
platelet disorders such as thrombocytopenia, and
generally for use in place of or complementary to platelet
transfusions; and/or in supporting the growth and
proliferation of hematopoietic stem cells which are capable
of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in
various stem cell disorders (such as those usually
treated with transplantation, including, without limitation,
aplastic anemia and paroxysmal nocturnal
hemoglobinuria), as well as in repopulating the stem cell
compartment post irradiation/chemotherapy, either in vivo or
ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation
(homologous or heterologous)) as normal cells or
genetically manipulated for gene therapy. Alternatively, as
described in more detail below, nucleic acids encoding
these proteins or polypeptides or nucleic acids
regulating the expression of these proteins or polypeptides may
be introduced into appropriate host cells to increase or
decrease the expression of the proteins as desired.

EXAMPLE 25

Assaying the Expressed Proteins or Polypeptides for
Regulation of Tissue Growth

[0313] The proteins or polypeptides encoded by the
nucleic acids described above may also be evaluated
for their effect on tissue growth. Numerous assays for
such activity are familiar to those skilled in the art,
including the assays disclosed in International Patent
Publication No. WO95/16035, International Patent
Publication No. WO95/05846 and International Patent
Publication No. WO91/07491.

[0314] Assays for wound healing activity include,
without limitation, those described in: Winter, Epidermal
Wound Healing, pps. 71-112 (Maibach, HI and Rovee,
DT, eds.), Year Book Medical Publishers, Inc., Chicago,
as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

[0315] Those proteins or polypeptides which are
involved in the regulation of tissue growth may then be
formulated as pharmaceuticals and used to treat clinical
conditions in which regulation of tissue growth is
beneficial. For example, a protein or polypeptide may have
utility in compositions used for bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as
well as for wound healing and tissue repair and
replacement, and in the treatment of burns, incisions and ulcers.
[0316] A protein or polypeptide encoded by the
nucleic acids described above, which induces cartilage and/
or bone growth in circumstances where bone is not
normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and
other animals. Such a preparation employing a protein
or polypeptide of the invention may have prophylactic
use in closed as well as open fracture reduction and also
in the improved fixation of artificial joints. De novo bone
synthesis induced by an osteogenic agent contributes
to the repair of congenital, trauma induced, or oncologic
resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.
[0317] A protein or polypeptide of this invention may
also be used in the treatment of periodontal disease,
and in other tooth repair processes. Such agents may
provide an environment to attract bone-forming cells,
stimulate growth of bone-forming cells or induce
differentiation of progenitors of bone-forming cells. A protein
of the invention may also be useful in the treatment of
osteoporosis or osteoarthritis, such as through
stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase
activity, osteoclast activity, etc.) mediated by
inflammatory processes.
[0318] Another category of tissue regeneration
activity that may be attributable to the proteins or
polypeptides encoded by the nucleic acids described above is
tendon/ligament formation. A protein or polypeptide
encoded by the nucleic acids described above, which
induces tendon/ligament-like tissue or other tissue
formation in circumstances where such tissue is not normally
formed, has application in the healing of tendon or
ligament tears, deformities and other tendon or ligament
defects in humans and other animals. Such a preparation
employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage
to tendon or ligament tissue, as well as use in the
improved fixation of tendon or ligament to bone or other
tissues, and in repairing defects to tendon or ligament
tissue. De novo tendon/ligament-like tissue formation
induced by a protein or polypeptide of the present
invention contributes to the repair of tendon or ligament
defects of congenital, traumatic or other origin and is
also useful in cosmetic plastic surgery for attachment or
repair of tendons or ligaments. The proteins or
polypeptides of the present invention may provide an
environment to attract tendon- or ligament-forming cells,
stimulate growth of tendon- or ligament-forming cells, induce
differentiation of progenitors of tendon- or ligament-
forming cells, or induce growth of tendon/ligament cells
or progenitors ex vivo for return in vivo to effect tissue
repair. The proteins or polypeptides of the invention may
also be useful in the treatment of tendinitis, carpal tunnel
syndrome and other tendon or ligament defects. The
therapeutic compositions may also include an
appropriate matrix and/or sequestering agent as a carrier as is
well known in the art.

[0319] The proteins or polypeptides of the present
invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue,
i.e., for the treatment of central and peripheral nervous
system diseases and neuropathies, as well as
mechanical and traumatic disorders, which involve
degeneration, death or trauma to neural cells or nerve tissue.
More specifically, a protein or polypeptide may be used
in the treatment of diseases of the peripheral nervous
system, such as peripheral nerve injuries, peripheral
neuropathy and localized neuropathies, and central
nervous system diseases, such as Alzheimer's,
Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further
conditions which may be treated in accordance with the
present invention include mechanical and traumatic
disorders, such as spinal cord disorders, head trauma and
cerebrovascular diseases such as stroke. Peripheral
neuropathies resulting from chemotherapy or other
medical therapies may also be treatable using a protein
or polypeptide of the invention.

[0320] Proteins or polypeptides of the invention may
also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure
ulcers, ulcers associated with vascular insufficiency,
surgical and traumatic wounds, and the like.
[0321] It is expected that a protein or polypeptide of
the present invention may also exhibit activity for
generation or regeneration of other tissues, such as organs
(including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or
cardiac) and vascular (including vascular endothelium)
tissue, or for promoting the growth of cells comprising such
tissues. Part of the desired effects may be by inhibition
or modulation of fibrotic scarring to allow normal tissue
to generate. A protein or polypeptide of the invention
may also exhibit angiogenic activity.
[0322] A protein or polypeptide of the present
invention may also be useful for gut protection or regeneration
and treatment of lung or liver fibrosis, reperfusion injury
in various tissues, and conditions resulting from
systemic cytokine damage.
[0323] A protein or polypeptide of the present
invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor
tissues or cells; or for inhibiting the growth of tissues
described above.

[0324] Alternatively, as described in more detail
below, nucleic acids encoding tissue growth regulating
activity proteins or polypeptides or nucleic acids regulating
the expression of such proteins or polypeptides may be
introduced into appropriate host cells to increase or
decrease the expression of the proteins as desired.

EXAMPLE 26

Assaying the Expressed Proteins or Polypeptides for
Regulation of Reproductive Hormones

[0325] The proteins or polypeptides of the present
invention may also be evaluated for their ability to regulate
reproductive hormones, such as follicle stimulating
hormone. Numerous assays for such activity are familiar to
those skilled in the art, including the assays disclosed
in the following references: Vale et al., Endocrinol. 91:
562-572, 1972; Ling et al., Nature 321:779-782, 1986;
Vale et al., Nature 321:776-779, 1986; Mason et al.,
Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad.
Sci. USA 83:3091-3095, 1986; Chapter 6.12 in Current
Protocols in Immunology, J.E. Coligan et al. Eds.
Greene Publishing Associates and Wiley-Interscience;
Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et
al. APMIS 103:140-146, 1995; Muller et al. Eur. J.
Immunol. 25:1744-1748; Gruber et al. J. Immunol. 152:
5860-5867, 1994; Johnston et al. J. Immunol. 153:
1762-1768, 1994.
[0326] Those proteins or polypeptides which exhibit
activity as reproductive hormones or regulators of cell
movement may then be formulated as pharmaceuticals
and used to treat clinical conditions in which regulation
of reproductive hormones is beneficial. For example,
a protein or polypeptide may exhibit activin- or inhibin-
related activities. Inhibins are characterized by their
ability to inhibit the release of follicle stimulating
hormone (FSH), while activins are characterized by their
ability to stimulate the release of FSH. Thus, a protein
or polypeptide of the present invention, alone or in
heterodimers with a member of the inhibin α family, may be
useful as a contraceptive based on the ability of inhibins
to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of
sufficient amounts of other inhibins can induce infertility
in these mammals. Alternatively, the protein or
polypeptide of the invention, as a homodimer or as a heterodimer
with other protein subunits of the inhibin-β group, may
be useful as a fertility inducing therapeutic, based upon
the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for
example, United States Patent 4,798,885. A protein or
polypeptide of the invention may also be useful for
advancement of the onset of fertility in sexually immature
mammals, so as to increase the lifetime reproductive
performance of domestic animals such as cows, sheep
and pigs.



[0327] Alternatively, as described in more detail
below, nucleic acids encoding reproductive hormone
regulating activity proteins or polypeptides or nucleic acids
regulating the expression of such proteins or
polypeptides may be introduced into appropriate host cells to
increase or decrease the expression of the proteins or
polypeptides as desired.

EXAMPLE 27 

Assaying the Expressed Proteins or Polypeptides For 
Chemotactic/Chemokinetic Activity

[0328] The proteins or polypeptides of the present
invention may also be evaluated for chemotactic/
chemokinetic activity. For example, a protein or polypeptide
of the present invention may have chemotactic or
chemokinetic activity (e.g., act as a chemokine) for
mammalian cells, including, for example, monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils,
epithelial and/or endothelial cells. Chemotactic and
chemokinetic proteins or polypeptides can be used to
mobilize or attract a desired cell population to a desired site
of action. Chemotactic or chemokinetic proteins or
polypeptides provide particular advantages in treatment
of wounds and other trauma to tissues, as well as in
treatment of localized infections. For example, attraction
of lymphocytes, monocytes or neutrophils to tumors or
sites of infection may result in improved immune
responses against the tumor or infecting agent.

[0329] A protein or polypeptide has chemotactic
activity for a particular cell population if it can stimulate,
directly or indirectly, the directed orientation or
movement of such cell population. Preferably, the protein or
polypeptide has the ability to directly stimulate directed
movement of cells. Whether a particular protein or
polypeptide has chemotactic activity for a population of
cells can be readily determined by employing such
protein or polypeptide in any known assay for cell
chemotaxis.

[0330] The activity of a protein or polypeptide of the
invention may, among other means, be measured by the
following methods:

[0331] Assays for chemotactic activity (which will
identify proteins or polypeptides that induce or prevent
chemotaxis) consist of assays that measure the ability
of a protein or polypeptide to induce the migration of
cells across a membrane as well as the ability of a
protein or polypeptide to induce the adhesion of one cell
population to another cell population. Suitable assays
for movement and adhesion include, without limitation,
those described in: Current Protocols in Immunology,
Ed by J.E. Coligan, A.M. Kruisbeek, D.H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience, Chapter 6.12:
6.12.1-6.12.28; Taub et al. J. Clin. Invest. 95:
1370-1376, 1995; Lind et al. APMIS 103:140-146, 1995;
Mueller et al., Eur. J. Immunol. 25:1744-1748; Gruber
et al. J. Immunol. 152:5860-5867, 1994; Johnston et al.
J. Immunol., 153:1762-1768, 1994.

EXAMPLE 28 



Assaying the Expressed Proteins or Polypeptides for 
Regulation of Blood Clotting 

[0332] The proteins or polypeptides of the present
invention may also be evaluated for their effects on blood
clotting. Numerous assays for such activity are familiar
to those skilled in the art, including the assays disclosed
in the following references: Linet et al., J. Clin.
Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79
(1991); Schaub, Prostaglandins 35:467-474, 1988.
[0333] Those proteins or polypeptides which are
involved in the regulation of blood clotting may then be
formulated as pharmaceuticals and used to treat clinical
conditions in which regulation of blood clotting is
beneficial. For example, a protein or polypeptide of the
invention may also exhibit hemostatic or thrombolytic activity.
As a result, such a protein or polypeptide is expected to
be useful in treatment of various coagulation disorders
(including hereditary disorders, such as hemophilias) or
to enhance coagulation and other hemostatic events in
treating wounds resulting from trauma, surgery or other
causes. A protein or polypeptide of the invention may
also be useful for dissolving or inhibiting formation of
thromboses and for treatment and prevention of
conditions resulting therefrom (such as infarction of cardiac
and central nervous system vessels (e.g., stroke)).
Alternatively, as described in more detail below, nucleic
acids encoding blood clotting activity proteins or
polypeptides or nucleic acids regulating the expression
of such proteins or polypeptides may be introduced into
appropriate host cells to increase or decrease the
expression of the proteins or polypeptides as desired.

EXAMPLE 29

Assaying the Expressed Proteins or Polypeptides for
Involvement in Receptor/Ligand Interactions

[0334] The proteins or polypeptides of the present
invention may also be evaluated for their involvement in
receptor/ligand interactions. Numerous assays for such
involvement are familiar to those skilled in the art,
including the assays disclosed in the following references:
Chapter 7 (7.28.1-7.28.22) in Current Protocols in
Immunology, J.E. Coligan et al. Eds. Greene Publishing
Associates and Wiley-Interscience; Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al.,
J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J.
Exp. Med. 169:149-160, 1989; Stoltenborg et al., J.
Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:
661-670, 1995; Gyuris et al., Cell 75:791-803, 1993.
[0335] For example, the proteins or polypeptides of
the present invention may also demonstrate activity as
receptors, receptor ligands or inhibitors or agonists of
receptor/ligand interactions. Examples of such
receptors and ligands include, without limitation, cytokine
receptors and their ligands, receptor kinases and their
ligands, receptor phosphatases and their ligands,
receptors involved in cell-cell interactions and their ligands
(including without limitation, cellular adhesion
molecules (such as selectins, integrins and their ligands) and
receptor/ligand pairs involved in antigen presentation,
antigen recognition and development of cellular and
humoral immune responses). Receptors and ligands are
also useful for screening of potential peptide or small
molecule inhibitors of the relevant receptor/ligand
interaction. A protein or polypeptide of the present invention
(including, without limitation, fragments of receptors and
ligands) may be useful as inhibitors of receptor/ligand
interactions. Alternatively, as described in more detail
below, nucleic acids encoding proteins or polypeptides
involved in receptor/ligand interactions or nucleic acids
regulating the expression of such proteins or
polypeptides may be introduced into appropriate host cells to
increase or decrease the expression of the proteins or
polypeptides as desired.

EXAMPLE 30

Assaying the Proteins or Polypeptides for
Anti-Inflammatory Activity

[0336] The proteins or polypeptides of the present
invention may also be evaluated for anti-inflammatory
activity. The anti-inflammatory activity may be achieved by
providing a stimulus to cells involved in the inflammatory
response, by inhibiting or promoting cell-cell
interactions (such as, for example, cell adhesion), by inhibiting
or promoting chemotaxis of cells involved in the
inflammatory process, inhibiting or promoting cell
extravasation, or by stimulating or suppressing production of other
factors which more directly inhibit or promote an
inflammatory response. Proteins or polypeptides exhibiting
such activities can be used to treat inflammatory
conditions including chronic or acute conditions, including
without limitation inflammation associated with infection
(such as septic shock, sepsis or systemic inflammatory
response syndrome), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated
hyperacute rejection, nephritis, cytokine- or chemokine-
induced lung injury, inflammatory bowel disease, Crohn's
disease or resulting from over production of cytokines
such as TNF or IL-1. Proteins or polypeptides of the
invention may also be useful to treat anaphylaxis and
hypersensitivity to an antigenic substance or material.
Alternatively, as described in more detail below, nucleic
acids encoding anti-inflammatory activity proteins or
polypeptides or nucleic acids regulating the expression
of such proteins or polypeptides may be introduced into
appropriate host cells to increase or decrease the
expression of the proteins or polypeptides as desired.

EXAMPLE 31 

Assaying the Expressed Proteins or Polypeptides for
Tumor Inhibition

[0337] The proteins or polypeptides of the present
invention may also be evaluated for tumor inhibition
activity. In addition to the activities described above for
immunological treatment or prevention of tumors, a protein
or polypeptide of the invention may exhibit other anti-
tumor activities. A protein or polypeptide may inhibit
tumor growth directly or indirectly (such as, for example,
via ADCC). A protein or polypeptide may exhibit its
tumor inhibitory activity by acting on tumor tissue or tumor
precursor tissue, by inhibiting formation of tissues
necessary to support tumor growth (such as, for example,
by inhibiting angiogenesis), by causing production of
other factors, agents or cell types which inhibit tumor
growth, or by suppressing, eliminating or inhibiting
factors, agents or cell types which promote tumor growth.
Alternatively, as described in more detail below, nucleic
acids encoding proteins or polypeptides with tumor
inhibition activity or nucleic acids regulating the
expression of such proteins or polypeptides may be introduced
into appropriate host cells to increase or decrease the
expression of the proteins or polypeptides as desired.
[0338] A protein or polypeptide of the invention may
also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or
function of, or killing, infectious agents, including, without
limitation, bacteria, viruses, fungi and other parasites;
affecting (suppressing or enhancing) bodily
characteristics, including, without limitation, height, weight, hair
color, eye color, skin, fat to lean ratio or other tissue
pigmentation, or organ or body part size or shape (such as,
for example, breast augmentation or diminution, change
in bone form or shape); affecting biorhythms or circadian
cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism,
anabolism, processing, utilization, storage or elimination of
dietary fat, lipid, protein, carbohydrate, vitamins, minerals,
cofactors or other nutritional factors or component(s);
affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including
cognitive disorders), depression (including depressive
disorders) and violent behaviors; providing analgesic
effects or other pain reducing effects; promoting
differentiation and growth of embryonic stem cells in lineages
other than hematopoietic lineages; hormonal or
endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related
diseases; treatment of hyperproliferative disorders
(such as, for example, psoriasis); immunoglobulin-like
activity (such as, for example, the ability to bind antigens
or complement); and the ability to act as an antigen in
a vaccine composition to raise an immune response
against such protein or another material or entity which
is cross-reactive with such protein. Alternatively, as
described in more detail below, nucleic acids encoding
proteins or polypeptides involved in any of the above
mentioned activities or nucleic acids regulating the
expression of such proteins may be introduced into appropriate
host cells to increase or decrease the expression of the
proteins or polypeptides as desired.

EXAMPLE 32

Identification of Proteins or Polypeptides which Interact
with Proteins or Polypeptides of the Present Invention

[0339] Proteins or polypeptides which interact with
the proteins or polypeptides of the present invention,
such as receptor proteins, may be identified using two
hybrid systems such as the Matchmaker Two Hybrid
System 2 (Catalog No. K1604-1, Clontech). As
described in the manual accompanying the kit, nucleic
acids encoding the proteins or polypeptides of the present
invention are inserted into an expression vector such
that they are in frame with DNA encoding the DNA
binding domain of the yeast transcriptional activator GAL4.
cDNAs in a cDNA library which encode proteins or
polypeptides which might interact with the proteins or
polypeptides of the present invention are inserted into
a second expression vector such that they are in frame
with DNA encoding the activation domain of GAL4. The
two expression plasmids are transformed into yeast and
the yeast are plated on selection medium which selects
for expression of selectable markers on each of the
expression vectors as well as GAL4 dependent
expression of the HIS3 gene. Transformants capable of
growing on medium lacking histidine are screened for GAL4
dependent lacZ expression. Those cells which are
positive in both the histidine selection and the lacZ assay
contain plasmids encoding proteins or polypeptides
which interact with the proteins or polypeptides of the
present invention.
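
The two-reporter selection described above can be illustrated with a short sketch. It is only a bookkeeping illustration, not part of the kit protocol: the `Transformant` record, its field names and the clone labels are hypothetical and were chosen for this example. It keeps only those library clones that are positive in both the histidine selection and the lacZ assay.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Transformant:
    """One yeast transformant from the screen (hypothetical record layout)."""
    clone_id: str              # label of the library cDNA clone (assumed naming)
    grows_without_his: bool    # growth on medium lacking histidine (HIS3 reporter)
    lacz_positive: bool        # GAL4 dependent lacZ expression detected


def candidate_interactors(transformants: Iterable[Transformant]) -> List[str]:
    """Return clone IDs that are positive in BOTH reporter assays."""
    return [
        t.clone_id
        for t in transformants
        if t.grows_without_his and t.lacz_positive
    ]


if __name__ == "__main__":
    # Illustrative data only; these are not experimental results.
    screen = [
        Transformant("clone-A", True, True),
        Transformant("clone-B", True, False),   # HIS3 only: not retained
        Transformant("clone-C", False, False),
    ]
    print(candidate_interactors(screen))
```

Requiring both reporters mirrors the text: growth without histidine alone does not qualify a clone, since the lacZ assay serves as the second, independent readout of GAL4 dependent expression.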

[0340] Alternatively, the system described in Lustig et
al., Methods in Enzymology 283: 83-99 (1997) may be
used for identifying molecules which interact with the
proteins or polypeptides of the present invention. In
such systems, in vitro transcription reactions are
performed on a pool of vectors containing nucleic acid
inserts which encode the proteins or polypeptides of the
present invention. The nucleic acid inserts are cloned
downstream of a promoter which drives in vitro
transcription. The resulting pools of mRNAs are introduced
into Xenopus laevis oocytes. The oocytes are then
assayed for a desired activity.
[0341] Alternatively, the pooled in vitro transcription
products produced as described above may be
translated in vitro. The pooled in vitro translation products can
be assayed for a desired activity or for interaction with
a known protein or polypeptide.
[0342] Proteins, polypeptides or other molecules
interacting with proteins or polypeptides of the present
invention can be found by a variety of additional
techniques. In one method, affinity columns containing the
protein or polypeptide of the present invention can be
constructed. In some versions of this method the affinity
column contains chimeric proteins in which the protein
or polypeptide of the present invention is fused to
glutathione S-transferase. A mixture of cellular proteins or
pool of expressed proteins as described above is
applied to the affinity column. Molecules interacting with
the protein or polypeptide attached to the column can
then be isolated and analyzed on 2-D electrophoresis
gel as described in Ramunsen et al. Electrophoresis, 18,
588-598 (1997). Alternatively, the molecules retained on
the affinity column can be purified by electrophoresis
based methods and sequenced. The same method can
be used to isolate antibodies, to screen phage display
products, or to screen phage display human antibodies.
[0343] Molecules interacting with the proteins or
polypeptides of the present invention can also be
screened by using an Optical Biosensor as described in
Edwards & Leatherbarrow, Analytical Biochemistry,
246, 1-6 (1997). The main advantage of the method is
that it allows the determination of the association rate
between the protein or polypeptide and other interacting
molecules. Thus, it is possible to specifically select
interacting molecules with a high or low association rate.
Typically a target molecule is linked to the sensor
surface (through a carboxymethyl dextran matrix) and a
sample of test molecules is placed in contact with the
target molecules. The binding of a test molecule to the
target molecule causes a change in the refractive index
and/or thickness. This change is detected by the
Biosensor provided it occurs in the evanescent field (which
extends a few hundred nanometers from the sensor
surface). In these screening assays, the target molecule
can be one of the proteins or polypeptides of the present
invention and the test sample can be a collection of
proteins, polypeptides or other molecules extracted from
tissues or cells, a pool of expressed proteins,
combinatorial peptide and/or chemical libraries, or phage
displayed peptides. The tissues or cells from which the test
molecules are extracted can originate from any species.
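
As an illustration of how an association rate might be extracted from biosensor data, the sketch below assumes a simple 1:1 pseudo-first-order binding model. That model is an assumption made for this example and is not taken from the cited reference. The response rises as R(t) = Req(1 - exp(-kobs t)) with kobs = ka C + kd, so a straight-line fit of kobs against test molecule concentration C gives ka as the slope and kd as the intercept. All function names and numerical values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def response(t: float, conc: float, ka: float, kd: float, rmax: float) -> float:
    """Sensor response at time t under the assumed 1:1 binding model."""
    kobs = ka * conc + kd                  # observed rate constant (1/s)
    req = rmax * ka * conc / kobs          # equilibrium response at this concentration
    return req * (1.0 - math.exp(-kobs * t))


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rates_from_kobs(concs: Sequence[float],
                    kobs_values: Sequence[float]) -> Tuple[float, float]:
    """Estimate (ka, kd) from kobs measured at several concentrations."""
    return fit_line(concs, kobs_values)


if __name__ == "__main__":
    # Illustrative values only: ka in 1/(M s), kd in 1/s, concentrations in M.
    concs = [1e-8, 2e-8, 4e-8]
    kobs_values = [1e5 * c + 1e-3 for c in concs]
    ka, kd = rates_from_kobs(concs, kobs_values)
    print(f"ka = {ka:.3g} 1/(M s), kd = {kd:.3g} 1/s")
```

Test molecules could then be ranked by the fitted ka to select those with a high or low association rate, as the paragraph above describes.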
[0344] In other methods, a target protein is
immobilized and the test population is a collection
of unique proteins or polypeptides of the present
invention.

[0345] To study the interaction of the proteins or
polypeptides of the present invention with drugs, the
microdialysis coupled to HPLC method described by
Wang et al., Chromatographia, 44, 205-208 (1997) or the
affinity capillary electrophoresis method described by
Busch et al., J. Chromatogr. 777:311-328 (1997) can be
used.

[0346] The system described in U.S. Patent No.
5,654,150 may also be used to identify molecules which
interact with the proteins or polypeptides of the present
invention. In this system, pools of nucleic acids
encoding the proteins or polypeptides of the present invention
are transcribed and translated in vitro and the reaction
products are assayed for interaction with a known
polypeptide or antibody.

[0347] It will be appreciated by those skilled in the art
that the proteins or polypeptides of the present invention
may be assayed for numerous activities in addition to
those specifically enumerated above. For example, the
expressed proteins or polypeptides may be evaluated
for applications involving control and regulation of
inflammation, tumor proliferation or metastasis, infection,
or other clinical conditions. In addition, the proteins or
polypeptides may be useful as nutritional agents or
cosmetic agents.
[0348] The proteins or polypeptides of the present
invention may be used to generate antibodies capable of
specifically binding to the proteins or polypeptides of the
present invention. The antibodies may be monoclonal
antibodies or polyclonal antibodies. As used herein,
"antibody" refers to a polypeptide or group of polypeptides
which are comprised of at least one binding domain,
where a binding domain is formed from the folding of
variable domains of an antibody molecule to form three-
dimensional binding spaces with an internal surface
shape and charge distribution complementary to the
features of an antigenic determinant of an antigen,
which allows an immunological reaction with the
antigen. Antibodies include recombinant proteins
comprising the binding domains, as well as fragments,
including Fab, Fab', F(ab)2, and F(ab')2 fragments.
[0349] As used herein, an "antigenic determinant" is
the portion of an antigen molecule that determines the
specificity of the antigen-antibody reaction. An "epitope"
refers to an antigenic determinant of a polypeptide. An
epitope can comprise as few as 3 amino acids in a
spatial conformation which is unique to the epitope.
Generally an epitope consists of at least 6 such amino acids,
and more usually at least 8-10 such amino acids.
Methods for determining the amino acids which make up an
epitope include x-ray crystallography, 2-dimensional
nuclear magnetic resonance, and epitope mapping, e.g.
the Pepscan method described by H. Mario Geysen et
al. 1984. Proc. Natl. Acad. Sci. U.S.A. 81:3998-4002;
PCT Publication No. WO 84/03564; and PCT
Publication No. WO 84/03506.

[0350] In some embodiments, the antibodies may be capable of specifically binding to a protein or polypeptide encoded by EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids. In some embodiments, the antibody may be capable of binding an antigenic determinant or an epitope in a protein or polypeptide encoded by EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids.

[0351] In other embodiments, the antibodies may be capable of specifically binding to an EST-related polypeptide, fragment of an EST-related polypeptide, positional segment of an EST-related polypeptide or fragment of a positional segment of an EST-related polypeptide. In some embodiments, the antibody may be capable of binding an antigenic determinant or an epitope in an EST-related polypeptide, fragment of an EST-related polypeptide, positional segment of an EST-related polypeptide or fragment of a positional segment of an EST-related polypeptide.
[0352] In the case of secreted proteins, the antibodies may be capable of binding a full-length protein encoded by a nucleic acid of the present invention, a mature protein (i.e. the protein generated by cleavage of the signal peptide) encoded by a nucleic acid of the present invention, or a signal peptide encoded by a nucleic acid of the present invention.

EXAMPLE 33

Production of an Antibody to a Human Polypeptide or Protein

[0353] The above described EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or nucleic acids encoding EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides or fragments of positional segments of EST-related polypeptides, are operably linked to promoters and introduced into cells as described above.
[0354] In the case of secreted proteins, nucleic acids encoding the full protein (i.e. the mature protein and the signal peptide), nucleic acids encoding the mature protein (i.e. the protein generated by cleavage of the signal peptide), or nucleic acids encoding the signal peptide are operably linked to promoters and introduced into cells as described above.
[0355] The encoded proteins or polypeptides are then substantially purified or isolated as described above. The concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few µg/ml. Monoclonal or polyclonal antibody to the protein or polypeptide can then be prepared as follows:

1. Monoclonal Antibody Production by Hybridoma Fusion

[0356] Monoclonal antibody to epitopes of any of the proteins or polypeptides identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler and Milstein, Nature 256:495 (1975) or derivative methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein or peptides derived therefrom over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, Meth. Enzymol. 70:419 (1980). Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al. Basic Methods in Molecular Biology, Elsevier, New York. Section 21-2.

2. Polyclonal Antibody Production by Immunization

[0357] Polyclonal antiserum containing antibodies to heterogenous epitopes of a single protein or polypeptide can be prepared by immunizing suitable animals with the expressed protein or peptides derived therefrom, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant. Also, host animals' responses vary depending on the site of inoculations and doses, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al. J. Clin. Endocrinol. Metab. 33:988-991 (1971).

[0358] Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, D. Wier (ed) Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, 2d Ed. (Rose and Friedman, Eds.) Amer. Soc. For Microbiol., Washington, D.C. (1980).

[0359] Antibody preparations prepared according to either of the above protocols are useful in a variety of contexts. In particular, the antibodies may be used in immunoaffinity chromatography techniques such as those described below to facilitate large scale isolation, purification, or enrichment of the proteins or polypeptides encoded by EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or for the isolation, purification or enrichment of EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides or fragments of positional segments of EST-related polypeptides.

[0360] In the case of secreted proteins, the antibodies may be used for the isolation, purification, or enrichment of the full protein (i.e. the mature protein and the signal peptide), the mature protein (i.e. the protein generated by cleavage of the signal peptide), or the signal peptide.

[0361] Additionally, the antibodies may be used in immunoaffinity chromatography techniques such as those described below to isolate, purify, or enrich polypeptides which have been fused to the proteins or polypeptides encoded by EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids, or to EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides or fragments of positional segments of EST-related polypeptides.

[0362] The antibodies may also be used to determine the cellular localization of the proteins or polypeptides encoded by EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or the cellular localization of EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides or fragments of positional segments of EST-related polypeptides.

[0363] In addition, the antibodies may also be used to determine the cellular localization of polypeptides which have been fused to the proteins or polypeptides encoded by EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or polypeptides which have been fused to EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides.

[0364] The antibodies may also be used in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they may also be used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample or to identify the type of tissue present in a biological sample. The antibodies may also be used in therapeutic compositions for killing cells expressing the protein or reducing the levels of the protein in the body.

V. Use of 5' ESTs and Consensus Contigated 5' ESTs or Sequences Obtainable Therefrom or Portions Thereof as Reagents

[0365] The EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be used as reagents in isolation procedures, diagnostic assays, and forensic procedures. For example, sequences from the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be detectably labeled and used as probes to isolate other sequences capable of hybridizing to them. In addition, the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be used to design PCR primers to be used in isolation, diagnostic, or forensic procedures.

1. Use of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids in isolation, diagnostic and forensic procedures

EXAMPLE 34

Preparation of PCR Primers and Amplification of DNA

[0366] The EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be used to prepare PCR primers for a variety of applications, including isolation procedures for cloning nucleic acids capable of hybridizing to such sequences, diagnostic techniques and forensic techniques. In some embodiments, the PCR primers are at least 10, 15, 18, 20, 23, 25, 28, 30, 40, or 50 nucleotides in length. In some embodiments, the PCR primers may be more than 30 bases in length. It is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are approximately the same. A variety of PCR techniques are familiar to those skilled in the art. For a review of PCR technology, see Molecular Cloning to Genetic Engineering, White, B.A. Ed. in Methods in Molecular Biology 67: Humana Press, Totowa 1997. In each of these PCR procedures, PCR primers on either side of the nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid sample along with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase, or Vent polymerase. The nucleic acid in the sample is denatured and the PCR primers are specifically hybridized to complementary nucleic acid sequences in the sample. The hybridized primers are extended. Thereafter, another cycle of denaturation, hybridization, and extension is initiated. The cycles are repeated multiple times to produce an amplified fragment containing the nucleic acid sequence between the primer sites.

EXAMPLE 35

Use of the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids as probes

[0367] Probes derived from EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be labeled with detectable labels familiar to those skilled in the art, including radioisotopes and non-radioactive labels, to provide a detectable probe. The detectable probe may be single stranded or double stranded and may be made using techniques known in the art, including in vitro transcription, nick translation, or kinase reactions. A nucleic acid sample containing a sequence capable of hybridizing to the labeled probe is contacted with the labeled probe. If the nucleic acid in the sample is double stranded, it may be denatured prior to contacting the probe. In some applications, the nucleic acid sample may be immobilized on a surface such as a nitrocellulose or nylon membrane. The nucleic acid sample may comprise nucleic acids obtained from a variety of sources, including genomic DNA, cDNA libraries, RNA, or tissue samples.

[0368] Procedures used to detect the presence of nucleic acids capable of hybridizing to the detectable probe include well known techniques such as Southern blotting, Northern blotting, dot blotting, colony hybridization, and plaque hybridization. In some applications, the nucleic acid capable of hybridizing to the labeled probe may be cloned into vectors such as expression vectors, sequencing vectors, or in vitro transcription vectors to facilitate the characterization and expression of the hybridizing nucleic acids in the sample. For example, such techniques may be used to isolate and clone sequences in a genomic library or cDNA library which are capable of hybridizing to the detectable probe as described in Example 18 above.

[0369] PCR primers made as described in Example 34 above may be used in forensic analyses, such as the DNA fingerprinting techniques described in Examples 36-40 below. Such analyses may utilize detectable probes or primers based on the sequences of the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids.

EXAMPLE 36

Forensic Matching by DNA Sequencing

[0370] In one exemplary method, DNA samples are isolated from forensic specimens of, for example, hair, semen, blood or skin cells by conventional methods. A panel of PCR primers based on a number of the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids is then utilized in accordance with Example 34 to amplify DNA of approximately 100-200 bases in length from the forensic specimen. Corresponding sequences are obtained from a test subject. Each of these identification DNAs is then sequenced using standard techniques, and a simple database comparison determines the differences, if any, between the sequences from the subject and those from the sample. Statistically significant differences between the suspect's DNA sequences and those from the sample conclusively prove a lack of identity. This lack of identity can be proven, for example, with only one sequence. Identity, on the other hand, should be demonstrated with a large number of sequences, all matching. Preferably, a minimum of 50 statistically identical sequences of 100 bases in length are used to prove identity between the suspect and the sample.

EXAMPLE 37

Positive Identification by DNA Sequencing

[0371] The technique outlined in the previous example may also be used on a larger scale to provide a unique fingerprint-type identification of any individual. In this technique, primers are prepared from a large number of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids. Preferably, 20 to 50 different primers are used. These primers are used to obtain a corresponding number of PCR-generated DNA segments from the individual in question in accordance with Example 34. Each of these DNA segments is sequenced, using the methods set forth in Example 36. The database of sequences generated through this procedure uniquely identifies the individual from whom the sequences were obtained. The same panel of primers may then be used at any later time to absolutely correlate tissue or other biological specimen with that individual.

EXAMPLE 38

Southern Blot Forensic Identification

[0372] The procedure of Example 37 is repeated to obtain a panel of at least 10 amplified sequences from an individual and a specimen. Preferably, the panel contains at least 50 amplified sequences. More preferably, the panel contains 100 amplified sequences. In some embodiments, the panel contains 200 amplified sequences. This PCR-generated DNA is then digested with one or a combination of, preferably, four base specific restriction enzymes. Such enzymes are commercially available and known to those of skill in the art. After digestion, the resultant gene fragments are size separated in multiple duplicate wells on an agarose gel and transferred to nitrocellulose using Southern blotting techniques well known to those with skill in the art. For a review of Southern blotting see Davis et al. (Basic Methods in Molecular Biology, 1986, Elsevier Press. pp 62-65).

[0373] A panel of probes based on the sequences of the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids are radioactively or colorimetrically labeled using methods known in the art, such as nick translation or end labeling, and hybridized to the Southern blot using techniques known in the art (Davis et al., supra). Preferably, the probes are at least 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400 or 500 nucleotides in length. In some embodiments, the probes are oligonucleotides which are 40 nucleotides in length or less.

[0374] Preferably, at least 5 to 10 of these labeled probes are used, and more preferably at least about 20 or 30 are used to provide a unique pattern. The resultant bands appearing from the hybridization of a large sample of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids will be a unique identifier. Since the restriction enzyme cleavage will be different for every individual, the band pattern on the Southern blot will also be unique. Increasing the number of probes will provide a statistically higher level of confidence in the identification since there will be an increased number of sets of bands used for identification.
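The statistical point made above can be illustrated with a short calculation. The sketch below is not part of the described procedure: it assumes that the band patterns produced by different probes are independent and that each probe has some per-probe chance of matching between two unrelated individuals. Both the independence assumption and the example value of 0.1 are hypothetical and chosen only for illustration.

```python
from math import prod


def coincidental_match_probability(per_probe_match_probs):
    """Chance that two unrelated samples match at every probe.

    Assumes (hypothetically) that probes behave independently, so the
    per-probe match probabilities simply multiply.
    """
    for p in per_probe_match_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie between 0 and 1")
    return prod(per_probe_match_probs)


# Hypothetical per-probe chance of a coincidental band-pattern match.
P_SINGLE_PROBE = 0.1

if __name__ == "__main__":
    for n_probes in (5, 10, 20, 30):
        p = coincidental_match_probability([P_SINGLE_PROBE] * n_probes)
        print(f"{n_probes:2d} probes: coincidental match probability {p:.3g}")
```

Under these assumptions, each additional probe multiplies the chance of a coincidental match by a factor below one, which is why a larger panel gives higher confidence.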

EXAMPLE 39

Dot Blot Identification Procedure

[0375] Another technique for identifying individuals using the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids disclosed herein utilizes a dot blot hybridization technique.
[0376] Genomic DNA is isolated from nuclei of the subject to be identified. Probes are prepared that correspond to at least 10, preferably 50 sequences from the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids. The probes are used to hybridize to the genomic DNA through conditions known to those in the art. The oligonucleotides are end labeled with P32 using polynucleotide kinase (Pharmacia). Dot blots are created by spotting the genomic DNA onto nitrocellulose or the like using a vacuum dot blot manifold (BioRad, Richmond, California). The nitrocellulose filter containing the genomic sequences is baked or UV linked to the filter, prehybridized and hybridized with labeled probe using techniques known in the art (Davis et al., supra). The 32P labeled DNA fragments are sequentially hybridized with successively stringent conditions to detect minimal differences between the 30 bp sequence and the DNA. Tetramethylammonium chloride is useful for identifying clones containing small numbers of nucleotide mismatches (Wood et al., Proc. Natl. Acad. Sci. USA 82(6):1585-1588 (1985)). A unique pattern of dots distinguishes one individual from another individual.
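As a rough illustration of how a pattern of dots can distinguish individuals, the sketch below converts per-probe hybridization signals into a presence/absence pattern and lists the probes at which two subjects differ. The signal values, the threshold, and the function names are all hypothetical; the procedure above does not specify how dots are scored.

```python
def dot_pattern(signals, threshold):
    """Score each probe as 1 (dot present) or 0 (dot absent).

    `threshold` is a hypothetical cut-off on an arbitrary signal scale.
    """
    return tuple(1 if s >= threshold else 0 for s in signals)


def differing_probes(pattern_a, pattern_b):
    """Return the indices of probes whose dots differ between two patterns."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same panel of probes")
    return [i for i, (a, b) in enumerate(zip(pattern_a, pattern_b)) if a != b]


if __name__ == "__main__":
    # Made-up signal intensities for a panel of six probes.
    subject_1 = dot_pattern([0.9, 0.1, 0.8, 0.7, 0.0, 0.6], threshold=0.5)
    subject_2 = dot_pattern([0.9, 0.8, 0.1, 0.7, 0.0, 0.6], threshold=0.5)
    print(subject_1, subject_2, differing_probes(subject_1, subject_2))
```

Two samples with an empty list of differing probes are indistinguishable on that panel; any non-empty list separates them.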

[0377] EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids can be used as probes in the following alternative fingerprinting technique. In some embodiments, the probes are oligonucleotides which are 40 nucleotides in length or less.

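[0378] Preferably, a plurality of probes having sequences from different EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids are used in the alternative fingerprinting technique. Example 40 below provides a representative alternative fingerprinting procedure in which the probes are derived from EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids.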

EXAMPLE 40

Alternative "Fingerprint" Identification Technique

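[0379] Oligonucleotides are prepared from a large number, e.g. 50, 100, or 200, EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids using commercially available oligonucleotide services such as Genset, Paris, France. Preferably, the oligonucleotides are at least 10, 15, 18, 20, 23, 25, 28, or 30 nucleotides in length. However, in some embodiments, the oligonucleotides may be more than 30 nucleotides in length.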

[0380] Cell samples from the test subject are processed for DNA using techniques well known to those with skill in the art. The nucleic acid is digested with restriction enzymes such as EcoRI and XbaI. Following digestion, samples are applied to wells for electrophoresis. The procedure, as known in the art, may be modified to accommodate polyacrylamide electrophoresis; however, in this example, samples containing 5 µg of DNA are loaded into wells and separated on 0.8% agarose gels. The gels are transferred onto nitrocellulose using standard Southern blotting techniques.
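The effect of the digestion step on fragment sizes can be sketched in silico. The code below is only an illustration: it cuts a linear sequence at every EcoRI recognition site (GAATTC, cleaved between G and A on the strand shown) and reports fragment lengths. The two short sequences are invented, and the function name and parameters are hypothetical; real genomic digests involve far longer molecules.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at each site.

    `cut_offset` is the position within the site where the strand is cut;
    1 corresponds to EcoRI cutting between G and AATTC.
    """
    seq = sequence.upper()
    cuts = []
    start = 0
    while True:
        i = seq.find(site, start)
        if i == -1:
            break
        cuts.append(i + cut_offset)
        start = i + 1
    bounds = [0] + cuts + [len(seq)]
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]


if __name__ == "__main__":
    # Invented sequences: the second has lost one EcoRI site (GAATTC -> GAGTTC).
    individual_a = "AAGAATTCTTTTGAATTCGG"
    individual_b = "AAGAATTCTTTTGAGTTCGG"
    print(digest(individual_a))
    print(digest(individual_b))
```

A single base change that destroys a recognition site merges two fragments into one, which is the kind of difference that shows up as a changed band on the gel.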
[0381] 10 ng of each of the oligonucleotides are pooled and end-labeled with P32. The nitrocellulose is prehybridized with blocking solution and hybridized with the labeled probes. Following hybridization and washing, the nitrocellulose filter is exposed to X-Omat AR X-ray film. The resulting hybridization pattern will be unique for each individual.

[0382] It is additionally contemplated within this example that the number of probe sequences used can be varied for additional accuracy or clarity.
[0383] In addition to their applications in forensics and identification, EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be mapped to their chromosomal locations. Example 41 below describes radiation hybrid (RH) mapping of human chromosomal regions using EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids. Example 42 below describes a representative procedure for mapping EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to their locations on human chromosomes. Example 43 below describes mapping of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids on metaphase chromosomes by Fluorescence In Situ Hybridization (FISH).

2. Use of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids in Chromosome Mapping

EXAMPLE 41

Radiation hybrid mapping of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to the human genome

[0384] Radiation hybrid (RH) mapping is a somatic cell genetic approach that can be used for high resolution mapping of the human genome. In this approach, cell lines containing one or more human chromosomes are lethally irradiated, breaking each chromosome into fragments whose size depends on the radiation dose. These fragments are rescued by fusion with cultured rodent cells, yielding subclones containing different portions of the human genome. This technique is described by Benham et al. (Genomics 4:509-517, 1989) and Cox et al. (Science 250:245-250, 1990). The random and independent nature of the subclones permits efficient mapping of any human genome marker. Human DNA isolated from a panel of 80-100 cell lines provides a mapping reagent for ordering EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids. In this approach, the frequency of breakage between markers is used to measure distance, allowing construction of fine resolution maps as has been done using conventional ESTs (Schuler et al., Science 274:540-546, 1996).
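The idea that breakage frequency measures distance can be made concrete with a simple two-point calculation. The sketch below is an assumption-laden illustration drawn from the general radiation hybrid literature rather than from this text: if a break between markers A and B occurs with probability theta, and fragments are then retained independently with frequencies r_A and r_B, the expected fraction of hybrids scoring discordantly is theta * (r_A + r_B - 2 * r_A * r_B). Solving for theta gives the estimator used here, and -100 * ln(1 - theta) converts it to centiRays. The typing vectors are invented.

```python
from math import log


def breakage_frequency(typing_a, typing_b):
    """Estimate the breakage frequency theta between two markers.

    Each typing is a list of 0/1 scores (marker absent/present), one per
    hybrid cell line, in the same order for both markers.
    """
    n = len(typing_a)
    if n == 0 or n != len(typing_b):
        raise ValueError("typings must be non-empty and of equal length")
    r_a = sum(typing_a) / n
    r_b = sum(typing_b) / n
    discordant = sum(1 for a, b in zip(typing_a, typing_b) if a != b) / n
    denom = r_a + r_b - 2 * r_a * r_b
    if denom == 0:
        raise ValueError("markers retained in all or in no hybrids are uninformative")
    # Sampling noise can push the raw estimate above 1; cap it there.
    return min(discordant / denom, 1.0)


def centirays(theta):
    """Convert a breakage frequency to a map distance in centiRays."""
    if theta >= 1.0:
        return float("inf")
    return -100.0 * log(1.0 - theta)


if __name__ == "__main__":
    marker_a = [1, 1, 0, 0, 1, 0, 1, 0]
    marker_b = [1, 0, 0, 0, 1, 0, 1, 1]
    theta = breakage_frequency(marker_a, marker_b)
    print(theta, centirays(theta))
```

Markers that are always retained or lost together give theta near zero and are placed close together; markers that behave independently give theta near one and are effectively unlinked at that radiation dose.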

[0385] RH mapping has been used to generate a high-resolution whole genome radiation hybrid map of human chromosome 17q22-q25.3 across the genes for growth hormone (GH) and thymidine kinase (TK) (Foster et al., Genomics 33:185-192, 1996), the region surrounding the Gorlin syndrome gene (Obermayr et al., Eur. J. Hum. Genet. 4:242-245, 1996), 60 loci covering the entire short arm of chromosome 12 (Raeymaekers et al., Genomics 29:170-178, 1995), the region of human chromosome 22 containing the neurofibromatosis type 2 locus (Frazer et al., Genomics 14:574-584, 1992) and 13 loci on the long arm of chromosome 5 (Warrington et al., Genomics 11:701-708, 1991).

EXAMPLE 42

Mapping of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to Human Chromosomes using PCR techniques

[0386] EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be assigned to human chromosomes using PCR based methodologies. In such approaches, oligonucleotide primer pairs are designed from EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to minimize the chance of amplifying through an intron. Preferably, the oligonucleotide primers are 18-23 bp in length and are designed for PCR amplification. The creation of PCR primers from known sequences is well known to those with skill in the art. For a review of PCR technology see Erlich, H.A., PCR Technology; Principles and Applications for DNA Amplification. 1992. W.H. Freeman and Co., New York.
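A simple check of a candidate primer pair against the preferred 18-23 bp length can be sketched as follows. The length window comes from the paragraph above; everything else is an assumption added for illustration. In particular, the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is a rough approximation for short oligonucleotides and is not the method described here, and the G/C tolerance and the two primer sequences are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Approximate melting temperature in degrees C by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def check_primer_pair(forward, reverse, min_len=18, max_len=23, max_gc_diff=0.10):
    """Return a list of problems found; an empty list means the pair passes.

    `max_gc_diff` is a hypothetical tolerance on the difference in G/C fraction.
    """
    problems = []
    for name, primer in (("forward", forward), ("reverse", reverse)):
        if not min_len <= len(primer) <= max_len:
            problems.append(
                f"{name} primer length {len(primer)} outside {min_len}-{max_len}"
            )
    if abs(gc_fraction(forward) - gc_fraction(reverse)) > max_gc_diff:
        problems.append("G/C fractions differ by more than the chosen tolerance")
    return problems


if __name__ == "__main__":
    fwd = "ATGCGTACCTGAAGCTTGAC"  # invented 20-mer
    rev = "TGCATGCCAGTTCAGGATCA"  # invented 20-mer
    print(check_primer_pair(fwd, rev), wallace_tm(fwd), wallace_tm(rev))
```

Matching the G/C fraction of the two primers keeps their estimated melting temperatures close, so both anneal under the same cycling conditions.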

[0387] The primers are used in polymerase chain reactions (PCR) to amplify templates from total human genomic DNA. PCR conditions are as follows: 60 ng of genomic DNA is used as a template for PCR with 80 ng of each oligonucleotide primer, 0.6 unit of Taq polymerase, and 1 µCi of a 32P-labeled deoxycytidine triphosphate. The PCR is performed in a microplate thermocycler (Techne) under the following conditions: 30 cycles of 94°C, 1.4 min; 55°C, 2 min; and 72°C, 2 min; with a final extension at 72°C for 10 min. The amplified products are analyzed on a 6% polyacrylamide sequencing gel and visualized by autoradiography. If the length of the resulting PCR product is identical to the distance between the ends of the primer sequences in the 5'EST from which the primers are derived, then the PCR reaction is repeated with DNA templates from two panels of human-rodent somatic cell hybrids, BIOS PCRable DNA (BIOS Corporation) and NIGMS Human-Rodent Somatic Cell Hybrid Mapping Panel Number 1 (NIGMS, Camden, NJ).

[0388] PCR is used to screen a series of somatic cell hybrid cell lines containing defined sets of human chromosomes for the presence of a given 5'EST. DNA is isolated from the somatic hybrids and used as starting templates for PCR reactions using the primer pairs from the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids. Only those somatic cell hybrids with chromosomes containing the human gene corresponding to the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids will yield an amplified fragment. The 5'ESTs are assigned to a chromosome by analysis of the segregation pattern of PCR products from the somatic hybrid DNA templates. The single human chromosome present in all cell hybrids that give rise to an amplified fragment is the chromosome containing that EST-related nucleic acid, positional segment of EST-related nucleic acids or fragment of positional segments of EST-related nucleic acids. For a review of techniques and analysis of results from somatic cell gene mapping experiments, see Ledbetter et al., Genomics 6:475-481 (1990).

[0389] Alternatively, the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be mapped to individual chromosomes using FISH as described in Example 43 below.

EXAMPLE 43

Mapping of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to Chromosomes Using Fluorescence In Situ Hybridization

[0390] Fluorescence in situ hybridization allows the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to be mapped to a particular location on a given chromosome. The chromosomes to be used for fluorescence in situ hybridization techniques may be obtained from a variety of sources including cell cultures, tissues, or whole blood.

[0391] In a preferred embodiment, chromosomal localization of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids are obtained by FISH as described by Cherif et al. (Proc. Natl. Acad. Sci. U.S.A., 87:6639-6643, 1990). Metaphase chromosomes are prepared from phytohemagglutinin (PHA)-stimulated blood cell donors. PHA-stimulated lymphocytes from healthy males are cultured for 72 h in RPMI-1640 medium. For synchronization, methotrexate (10 µM) is added for 17 h, followed by addition of 5-bromodeoxyuridine (5-BrdU, 0.1 mM) for 6 h. Colcemid (1 µg/ml) is added for the last 15 min before harvesting the cells. Cells are collected, washed in RPMI, incubated with a hypotonic solution of KCl (75 mM) at 37°C for 15 min and fixed in three changes of methanol:acetic acid (3:1). The cell suspension is dropped onto a glass slide and air dried. The EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids are labeled with biotin-16 dUTP by nick translation according to the manufacturer's instructions (Bethesda Research Laboratories, Bethesda, MD), purified using a Sephadex G-50 column (Pharmacia, Upsala, Sweden) and precipitated. Just prior to hybridization, the DNA pellet is dissolved in hybridization buffer (50% formamide, 2 X SSC, 10% dextran sulfate, 1 mg/ml sonicated salmon sperm DNA, pH 7) and the probe is denatured at 70°C for 5-10 min.

[0392] Slides kept at -20°C are treated for 1 h at 37°C with RNase A (100 µg/ml), rinsed three times in 2 X SSC and dehydrated in an ethanol series. Chromosome preparations are denatured in 70% formamide, 2 X SSC for 2 min at 70°C, then dehydrated at 4°C. The slides are treated with proteinase K (10 µg/100 ml in 20 mM Tris-HCl, 2 mM CaCl2) at 37°C for 8 min and dehydrated. The hybridization mixture containing the probe is placed on the slide, covered with a coverslip, sealed with rubber cement and incubated overnight in a humid chamber at 37°C. After hybridization and post-hybridization washes, the biotinylated probe is detected by avidin-FITC and amplified with additional layers of biotinylated goat anti-avidin and avidin-FITC. For chromosomal localization, fluorescent R-bands are obtained as previously described (Cherif et al., supra). The slides are observed under a LEICA fluorescence microscope (DMRXA). Chromosomes are counterstained with propidium iodide and the fluorescent signal of the probe appears as two symmetrical yellow-green spots on both chromatids of the fluorescent R-band chromosome (red). Thus, particular EST-related nucleic acids, positional segments of EST-related nucleic acids, or fragments of positional segments of EST-related nucleic acids may be localized to a particular cytogenetic R-band on a given chromosome. Once the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids have been assigned to particular chromosomes using the techniques described in Examples 41-43 above, they may be utilized to construct a high resolution map of the chromosomes on which they are located or to identify the chromosomes in a sample.

EXAMPLE 44

Use of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to Construct or Expand Chromosome Maps

[0393] Chromosome mapping involves assigning a given unique sequence to a particular chromosome as described above. Once the unique sequence has been mapped to a given chromosome, it is ordered relative to other unique sequences located on the same chromosome. One approach to chromosome mapping utilizes a series of yeast artificial chromosomes (YACs) bearing several thousand long inserts derived from the chromosomes of the organism from which the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids are obtained. This approach is described in Ramaiah Nagaraja et al., Genome Research 7:210-222, March 1997. Briefly, in this approach each chromosome is broken into overlapping pieces which are inserted into the YAC vector. The YAC inserts are screened using PCR or other methods to determine whether they include the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids whose position is to be determined. Once an insert has been found which includes the 5' EST, the insert can be analyzed by PCR or other methods to determine whether the insert also contains other sequences known to be on the chromosome or in the region from which the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids was derived. This process can be repeated for each insert in the YAC library to determine the location of each of the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids relative to one another and to other known chromosomal markers. In this way, a high resolution map of the distribution of numerous unique markers along each of the organism's chromosomes may be obtained.
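The ordering step described in this paragraph can be illustrated with a small sketch. It assumes the PCR screening results are already available as a table of which markers amplify from which YAC insert; the function names, the marker names, and the greedy ordering heuristic are illustrative assumptions, not the procedure of Nagaraja et al.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(yac_hits):
    """Count, for each pair of markers, how many YAC inserts contain both."""
    counts = Counter()
    for markers in yac_hits.values():
        for a, b in combinations(sorted(markers), 2):
            counts[(a, b)] += 1
    return counts


def order_markers(yac_hits, start):
    """Greedy ordering (an assumed heuristic): from the current marker, step to
    the unplaced marker that shares the most inserts with it."""
    counts = co_occurrence(yac_hits)
    remaining = set().union(*yac_hits.values()) - {start}
    order = [start]
    while remaining:
        current = order[-1]
        nxt = max(
            sorted(remaining),
            key=lambda m: counts[tuple(sorted((current, m)))],
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical screening results: YAC insert -> markers detected in it.
example_hits = {
    "YAC1": {"EST_A", "STS_1"},
    "YAC2": {"STS_1", "EST_B"},
    "YAC3": {"EST_B", "STS_2"},
}
```

Markers that fall on the same overlapping inserts are placed next to one another, which is the intuition behind content-based ordering; a real map would also have to handle chimeric inserts and false negatives.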
[0394] As described in Example 45 below, EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may also be used to identify genes associated with a particular phenotype, such as hereditary disease or drug response.

3. Use of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids in Gene Identification

EXAMPLE 45

Identification of genes associated with hereditary diseases or drug response

[0395] This example illustrates an approach useful for the association of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids with particular phenotypic characteristics. In this example, a particular EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids is used as a test probe to associate that EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids with a particular phenotypic characteristic.
[0396] EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids are mapped to a particular location on a human chromosome using techniques such as those described in Examples 41 and 42 or other techniques known in the art. A search of Mendelian Inheritance in Man (V. McKusick, Mendelian Inheritance in Man (available on line through Johns Hopkins University Welch Medical Library)) reveals the region of the human chromosome which contains the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to be a very gene rich region containing several known genes and several diseases or phenotypes for which genes have not been identified. The gene corresponding to this EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids thus becomes an immediate candidate for each of these genetic diseases.
[0397] Cells from patients with these diseases or phenotypes are isolated and expanded in culture. PCR primers from the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids are used to screen genomic DNA, mRNA or cDNA obtained from the patients. EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids that are not amplified in the patients can be positively associated with a particular disease by further analysis. Alternatively, the PCR analysis may yield fragments of different lengths when the samples are derived from an individual having the phenotype associated with the disease than when the sample is derived from a healthy individual, indicating that the gene containing the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be responsible for the genetic disease.
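The two outcomes described above (no amplification in the patient, or a product of different length) can be expressed as a simple comparison. The sketch below assumes product sizes have already been read off a gel in base pairs; the function name and the 5 bp sizing tolerance are hypothetical choices for illustration only.

```python
def classify_locus(patient_sizes, control_sizes, tolerance_bp=5):
    """Compare PCR product sizes (bp) from a patient and a healthy control.

    Returns "not amplified", "length difference" or "no difference".
    The tolerance is an assumed allowance for gel sizing error.
    """
    if control_sizes and not patient_sizes:
        return "not amplified"
    unmatched = [
        p for p in patient_sizes
        if all(abs(p - c) > tolerance_bp for c in control_sizes)
    ]
    return "length difference" if unmatched else "no difference"
```

Either of the first two results only flags a candidate; as the text notes, association with a disease still requires further analysis.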

VI. Use of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to Construct Vectors

[0398] The present EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may also be used to construct secretion vectors capable of directing the secretion of the proteins encoded by genes therein. Such secretion vectors may facilitate the purification or enrichment of the proteins encoded by genes inserted therein by reducing the number of background proteins from which the desired protein must be purified or enriched. Exemplary secretion vectors are described in Example 46.

1. Construction of secretion vectors

EXAMPLE 46

Construction of Secretion Vectors

[0399] The secretion vectors of the present invention include a promoter capable of directing gene expression in the host cell, tissue, or organism of interest. Such promoters include the Rous Sarcoma Virus promoter, the SV40 promoter, the human cytomegalovirus promoter, and other promoters familiar to those skilled in the art.
[0400] A signal sequence from one of the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids is operably linked to the promoter such that the mRNA transcribed from the promoter will direct the translation of the signal peptide. Preferably, the signal sequence is from one of the nucleic acids of SEQ ID NOs. 24-4100. The host cell, tissue, or organism may be any cell, tissue, or organism which recognizes the signal peptide encoded by the signal sequence in the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids. Suitable hosts include mammalian cells, tissues or organisms, avian cells, tissues, or organisms, insect cells, tissues or organisms, or yeast.
[0401] In addition, the secretion vector contains cloning sites for inserting genes encoding the proteins which are to be secreted. The cloning sites facilitate the cloning of the insert gene in frame with the signal sequence such that a fusion protein in which the signal peptide is fused to the protein encoded by the inserted gene is expressed from the mRNA transcribed from the promoter. The signal peptide directs the extracellular secretion of the fusion protein.
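The in-frame requirement can be checked mechanically before a construct is built. The following is a minimal sketch under stated assumptions: the sequences are given as coding-strand DNA, the cloning site is modeled as a short linker between signal sequence and insert, and all names and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete codons, ignoring any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_in_frame_fusion(signal_seq, linker, insert_orf):
    """Return True if the insert is read in the frame set by the signal
    sequence and no stop codon occurs before the final codon."""
    signal_seq, linker, insert_orf = (
        s.upper() for s in (signal_seq, linker, insert_orf)
    )
    if not signal_seq.startswith("ATG"):
        return False
    # Everything upstream of the insert must be a whole number of codons.
    if (len(signal_seq) + len(linker)) % 3 != 0:
        return False
    if len(insert_orf) % 3 != 0:
        return False
    fused = signal_seq + linker + insert_orf
    return not any(c in STOP_CODONS for c in codons(fused)[:-1])
```

For example, a 9-base signal sequence joined through a 6-base linker keeps the insert in frame, whereas a 5-base linker shifts the frame and the fusion protein would not be expressed as intended.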

[0402] The secretion vector may be DNA or RNA and may integrate into the chromosome of the host, be stably maintained as an extrachromosomal replicon in the host, be an artificial chromosome, or be transiently present in the host. Preferably, the secretion vector is maintained in multiple copies in each host cell. As used herein, multiple copies means at least 2, 5, 10, 20, 25, 50 or more than 50 copies per cell. In some embodiments, the multiple copies are maintained extrachromosomally. In other embodiments, the multiple copies result from amplification of a chromosomal sequence.
[0403] Many nucleic acid backbones suitable for use as secretion vectors are known to those skilled in the art, including retroviral vectors, SV40 vectors, Bovine Papilloma Virus vectors, yeast integrating plasmids, yeast episomal plasmids, yeast artificial chromosomes, human artificial chromosomes, P element vectors, baculovirus vectors, or bacterial plasmids capable of being transiently introduced into the host.
[0404] The secretion vector may also contain a polyA signal such that the polyA signal is located downstream of the gene inserted into the secretion vector.
[0405] After the gene encoding the protein for which secretion is desired is inserted into the secretion vector, the secretion vector is introduced into the host cell, tissue, or organism using calcium phosphate precipitation, DEAE-Dextran, electroporation, liposome-mediated transfection, viral particles or as naked DNA. The protein encoded by the inserted gene is then purified or enriched from the supernatant using conventional techniques such as ammonium sulfate precipitation, immunoprecipitation, immunoaffinity chromatography, size exclusion chromatography, ion exchange chromatography, and HPLC. Alternatively, the secreted protein may be in a sufficiently enriched or pure state in the supernatant or growth media of the host to permit it to be used for its intended purpose without further enrichment.
[0406] The signal sequences may also be inserted into vectors designed for gene therapy. In such vectors, the signal sequence is operably linked to a promoter such that mRNA transcribed from the promoter encodes the signal peptide. A cloning site is located downstream of the signal sequence such that a gene encoding a protein whose secretion is desired may readily be inserted into the vector and fused to the signal sequence. The vector is introduced into an appropriate host cell. The protein expressed from the promoter is secreted extracellularly, thereby producing a therapeutic effect.

EXAMPLE 47 

Fusion Vectors 

[0407] The EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be used to construct fusion vectors for the expression of chimeric polypeptides. The chimeric polypeptides comprise a first polypeptide portion and a second polypeptide portion. In the fusion vectors of the present invention, nucleic acids encoding the first polypeptide portion and the second polypeptide portion are joined in frame with one another so as to generate a nucleic acid encoding the chimeric polypeptide. The nucleic acid encoding the chimeric polypeptide is operably linked to a promoter which directs the expression of an mRNA encoding the chimeric polypeptide. The promoter may be in any of the expression vectors described herein including those described in Examples 20 and 46.

[0408] Preferably, the fusion vector is maintained in multiple copies in each host cell. In some embodiments, the multiple copies are maintained extrachromosomally. In other embodiments, the multiple copies result from amplification of a chromosomal sequence.
[0409] The first polypeptide portion may comprise any of the polypeptides encoded by the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids. In some embodiments, the first polypeptide portion may be one of the EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides.
[0410] The second polypeptide portion may comprise any polypeptide of interest. In some embodiments, the second polypeptide portion may comprise a polypeptide having a detectable enzymatic activity such as green fluorescent protein or β-galactosidase. Chimeric polypeptides in which the second polypeptide portion comprises a detectable polypeptide may be used to determine the intracellular localization of the first polypeptide portion. In such procedures, the fusion vector encoding the chimeric polypeptide is introduced into a host cell under conditions which facilitate the expression of the chimeric polypeptide. Where appropriate, the cells are treated with a detection reagent which is visible under the microscope following a catalytic reaction with the detectable polypeptide and the cellular location of the detection reagent is determined. For example, if the polypeptide having a detectable enzymatic activity is β-galactosidase, the cells may be treated with Xgal. Alternatively, where the detectable polypeptide is directly detectable without the addition of a detection reagent, the intracellular location of the chimeric polypeptide is determined by performing microscopy under conditions in which the detectable polypeptide is visible. For example, if the detectable polypeptide is green fluorescent protein or a modified version thereof, microscopy is performed by exposing the host cells to light having an appropriate wavelength to cause the green fluorescent protein or modified version thereof to fluoresce.
[0411] Alternatively, the second polypeptide portion may comprise a polypeptide whose isolation, purification, or enrichment is desired. In such embodiments, the isolation, purification, or enrichment of the second polypeptide portion may be achieved by performing the immunoaffinity chromatography procedures described below using an immunoaffinity column having an antibody directed against the first polypeptide portion coupled thereto.
[0412] The proteins encoded by the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or the EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides may also be used to generate antibodies as explained in Examples 20 and 33 in order to identify the tissue type or cell species from which a sample is derived as described in Example 48.

EXAMPLE 48 

Identification of Tissue Types or Cell Species by Means of Labeled Tissue Specific Antibodies

[0413] Identification of specific tissues is accomplished by the visualization of tissue specific antigens by means of antibody preparations according to Examples 20 and 33 which are conjugated, directly or indirectly, to a detectable marker. Selected labeled antibody species bind to their specific antigen binding partner in tissue sections, cell suspensions, or in extracts of soluble proteins from a tissue sample to provide a pattern for qualitative or semi-qualitative interpretation.
[0414] Antisera for these procedures must have a potency exceeding that of the native preparation, and for that reason, antibodies are concentrated to a mg/ml level by isolation of the gamma globulin fraction, for example, by ion-exchange chromatography or by ammonium sulfate fractionation. Also, to provide the most specific antisera, unwanted antibodies, for example to common proteins, must be removed from the gamma globulin fraction, for example by means of insoluble immunoabsorbents, before the antibodies are labeled with the marker. Either monoclonal or heterologous antisera is suitable for either procedure.

1. Immunohistochemical Techniques 

[0415] Purified, high-titer antibodies, prepared as described above, are conjugated to a detectable marker, as described, for example, by Fudenberg, H., Chap. 26 in: Basic and Clinical Immunology, 3rd Ed. Lange, Los Altos, California (1980) or Rose, N. et al., Chap. 12 in: Methods in Immunodiagnosis, 2d Ed. John Wiley and Sons, New York (1980).

[0416] A fluorescent marker, either fluorescein or rhodamine, is preferred, but antibodies can also be labeled with an enzyme that supports a color producing reaction with a substrate, such as horseradish peroxidase. Markers can be added to tissue-bound antibody in a second step, as described below. Alternatively, the specific antitissue antibodies can be labeled with ferritin or other electron dense particles, and localization of the ferritin coupled antigen-antibody complexes achieved by means of an electron microscope. In yet another approach, the antibodies are radiolabeled, with, for example 125I, and detected by overlaying the antibody treated preparation with photographic emulsion.
[0417] Preparations to carry out the procedures can comprise monoclonal or polyclonal antibodies to a single protein or peptide identified as specific to a tissue type, for example, brain tissue, or antibody preparations to several antigenically distinct tissue specific antigens can be used in panels, independently or in mixtures, as required.

[0418] Tissue sections and cell suspensions are prepared for immunohistochemical examination according to common histological techniques. Multiple cryostat sections (about 4 µm, unfixed) of the unknown tissue and known control are mounted and each slide covered with different dilutions of the antibody preparation. Sections of known and unknown tissues should also be treated with preparations to provide a positive control, a negative control, for example, pre-immune sera, and a control for non-specific staining, for example, buffer.
[0419] Treated sections are incubated in a humid chamber for 30 min at room temperature, rinsed, then washed in buffer for 30-45 min. Excess fluid is blotted away, and the marker developed.
[0420] If the tissue specific antibody was not labeled in the first incubation, it can be labeled at this time in a second antibody-antibody reaction, for example, by adding fluorescein- or enzyme-conjugated antibody against the immunoglobulin class of the antiserum-producing species, for example, fluorescein labeled antibody to mouse IgG. Such labeled sera are commercially available.

[0421] The antigen found in the tissues by the above procedure can be quantified by measuring the intensity of color or fluorescence on the tissue section, and calibrating that signal using appropriate standards.
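Calibration against standards can be sketched as a standard curve. The example below assumes, for illustration only, that signal intensity rises linearly with antigen amount over the range of the standards; the function names and the numbers in the usage note are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(intensity, standards):
    """Convert a measured intensity into an antigen amount.

    standards: list of (known_amount, measured_intensity) pairs.
    """
    amounts = [amount for amount, _ in standards]
    signals = [signal for _, signal in standards]
    slope, intercept = fit_line(amounts, signals)
    return (intensity - intercept) / slope
```

With standards of 0, 1, 2 and 4 units giving intensities of 10, 30, 50 and 90, an unknown section reading 70 would be assigned 3 units. Readings outside the range of the standards should not be extrapolated.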

2. Identification of Tissue Specific Soluble Proteins 

[0422] The visualization of tissue specific proteins and identification of unknown tissues from that procedure is carried out using the labeled antibody reagents and detection strategy as described for immunohistochemistry; however the sample is prepared according to an electrophoretic technique to distribute the proteins extracted from the tissue in an orderly array on the basis of molecular weight for detection.
[0423] A tissue sample is homogenized using a Virtis apparatus; cell suspensions are disrupted by Dounce homogenization or osmotic lysis, using detergents in either case as required to disrupt cell membranes, as is the practice in the art. Insoluble cell components such as nuclei, microsomes, and membrane fragments are removed by ultracentrifugation, and the soluble protein-containing fraction concentrated if necessary and reserved for analysis.

[0424] A sample of the soluble protein solution is resolved into individual protein species by conventional SDS polyacrylamide electrophoresis as described, for example, by Davis, L. et al. in: Basic Methods in Molecular Biology (P. Leder, ed), Elsevier, New York (1986), using a range of amounts of polyacrylamide in a set of gels to resolve the entire molecular weight range of proteins to be detected in the sample. A size marker is run in parallel for purposes of estimating molecular weights of the constituent proteins. Sample size for analysis is a convenient volume of from 5 to 55 µl, and containing from about 1 to 100 µg protein. An aliquot of each of the resolved proteins is transferred by blotting to a nitrocellulose filter paper, a process that maintains the pattern of resolution. Multiple copies are prepared. The procedure, known as Western Blot Analysis, is well described in Davis, L. et al., supra Section 19-3. One set of nitrocellulose blots is stained with Coomassie Blue dye to visualize the entire set of proteins for comparison with the antibody bound proteins. The remaining nitrocellulose filters are then incubated with a solution of one or more specific antisera to tissue specific proteins prepared as described in Examples 20 and 33. In this procedure, as in procedure A above, appropriate positive and negative sample and reagent controls are run.
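Estimating a molecular weight from the size marker lane is commonly done by interpolating on a logarithmic scale, since migration in SDS gels is roughly linear in the logarithm of molecular weight over a limited range. The sketch below makes that assumption explicit; the function name and marker values are hypothetical.

```python
import math


def estimate_mw(migration, markers):
    """Estimate molecular weight (kDa) from migration distance.

    markers: list of (migration_distance, molecular_weight_kDa) pairs taken
    from the size marker lane. Interpolates log10(MW) linearly between the
    two markers that flank the band.
    """
    points = sorted(markers)
    for (d1, mw1), (d2, mw2) in zip(points, points[1:]):
        if d1 <= migration <= d2:
            fraction = (migration - d1) / (d2 - d1)
            log_mw = math.log10(mw1) + fraction * (
                math.log10(mw2) - math.log10(mw1)
            )
            return 10 ** log_mw
    raise ValueError("migration lies outside the range of the size markers")
```

A band halfway between a 100 kDa marker and a 10 kDa marker is assigned about 31.6 kDa rather than 55 kDa, which is why the interpolation is done on the logarithm. Bands outside the marker range are rejected rather than extrapolated.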

[0425] In either procedure described above a detectable label can be attached to the primary tissue antigen-primary antibody complex according to various strategies and permutations thereof. In a straightforward approach, the primary specific antibody can be labeled; alternatively, the unlabeled complex can be bound by a labeled secondary anti-IgG antibody. In other approaches, either the primary or secondary antibody is conjugated to a biotin molecule, which can, in a subsequent step, bind an avidin conjugated marker. According to yet another strategy, enzyme labeled or radioactive protein A, which has the property of binding to any IgG, is bound in a final step to either the primary or secondary antibody.

EXAMPLE 49

Immunohistochemical Localization of Polypeptides

[0426] The antibodies prepared as described in Examples 20 and 33 above may be utilized to determine the cellular location of a polypeptide. The polypeptide may be any of the polypeptides encoded by EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or the polypeptide may be one of the EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides. In some embodiments, the polypeptide may be a chimeric polypeptide such as those encoded by the fusion vectors of Example 47.
[0427] Cells expressing the polypeptide to be localized are applied to a microscope slide and fixed using any of the procedures typically employed in immunohistochemical localization techniques, including the methods described in Current Protocols in Molecular Biology, John Wiley and Sons, Inc. 1997. Following a washing step, the cells are contacted with the antibody. In some embodiments, the antibody is conjugated to a detectable marker as described above to facilitate detection. Alternatively, in some embodiments, after the cells have been contacted with an antibody to the polypeptide to be localized, a secondary antibody which has been conjugated to a detectable marker is placed in contact with the antibody against the polypeptide to be localized.
[0428] Thereafter, microscopy is performed under conditions suitable for visualizing the cellular location of the polypeptide.
[0429] The visualization of tissue specific antigen binding at levels above those seen in control tissues to one or more tissue specific antibodies, directed against the polypeptides encoded by EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or antibodies against the EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides, can identify tissues of unknown origin, for example, forensic samples, or differentiated tumor tissue that has metastasized to foreign bodily sites.
[0430] The antibodies of Example 20 and 33 may also be used in the immunoaffinity chromatography techniques described below to isolate, purify, or enrich the polypeptides encoded by the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or to isolate, purify or enrich EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides. The immunoaffinity chromatography techniques described below may also be used to isolate, purify or enrich polypeptides which have been linked to the polypeptides encoded by the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or to isolate, purify or enrich polypeptides which have been linked to EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides.

EXAMPLE 50

Immunoaffinity Chromatography

[0431] Antibodies prepared as described above are coupled to a support. Preferably, the antibodies are monoclonal antibodies, but polyclonal antibodies may also be used. The support may be any of those typically employed in immunoaffinity chromatography, including Sepharose CL-4B (Pharmacia, Piscataway, NJ), Sepharose CL-2B (Pharmacia, Piscataway, NJ), Affi-gel 10 (Biorad, Richmond, CA).
[0432] The antibodies may be coupled to the support using any of the coupling reagents typically used in immunoaffinity chromatography, including cyanogen bromide. After coupling the antibody to the support, the support is contacted with a sample which contains a target polypeptide whose isolation, purification or enrichment is desired. The target polypeptide may be a polypeptide encoded by the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or the target polypeptide may be one of the EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides. The target polypeptides may also be polypeptides which have been linked to the polypeptides encoded by the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids or the target polypeptides may be polypeptides which have been linked to EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides using the fusion vectors described above.
[0433] Preferably, the sample is placed in contact with the support for a sufficient amount of time and under appropriate conditions to allow at least 50% of the target polypeptide to specifically bind to the antibody coupled to the support.

[0434] Thereafter, the support is washed with an appropriate wash solution to remove polypeptides which have non-specifically adhered to the support. The wash solution may be any of those typically employed in immunoaffinity chromatography, including PBS, Tris-lithium chloride buffer (0.1 M lysine base and 0.5 M lithium chloride, pH 8.0), Tris-hydrochloride buffer (0.05 M Tris-hydrochloride, pH 8.0), or Tris/Triton/NaCl buffer (50 mM Tris.cl, pH 8.0 or 9.0, 0.1% Triton X-100, and 0.5 M NaCl).

[0435] After washing, the specifically bound target polypeptide is eluted from the support using the high pH or low pH elution solutions typically employed in immunoaffinity chromatography. In particular, the elution solutions may contain an eluant such as triethanolamine, diethylamine, calcium chloride, sodium thiocyanate, potassium bromide, acetic acid, or glycine. In some embodiments, the elution solution may also contain a detergent such as Triton X-100 or octyl-β-D-glucoside.
[0436] The EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may also be used to clone sequences located upstream of the 5' ESTs which are capable of regulating gene expression, including promoter sequences, enhancer sequences, and other upstream sequences which influence transcription or translation levels. Once identified and cloned, these upstream regulatory sequences may be used in expression vectors designed to direct the expression of an inserted gene in a desired spatial, temporal, developmental, or quantitative fashion. Example 51 describes a method for cloning sequences upstream of the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids.

2. Identification of upstream sequences with promoting or regulatory activities

Use of EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids to Clone Upstream Sequences from Genomic DNA

[0437] Sequences derived from EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids may be used to isolate the promoters of the corresponding genes using chromosome walking techniques. In one chromosome walking technique, which utilizes the GenomeWalker™ kit available from Clontech, five complete genomic DNA samples are each digested with a different restriction enzyme which has a 6 base recognition site and leaves a blunt end. Following digestion, oligonucleotide adapters are ligated to each end of the resulting genomic DNA fragments.
[0438] For each of the five genomic DNA libraries, a first PCR reaction is performed according to the manufacturer's instructions using an outer adapter primer provided in the kit and an outer gene specific primer. The gene specific primer should be selected to be specific for the 5' EST of interest and should have a melting temperature, length, and location in the EST-related nucleic acids, positional segments of EST-related nucleic acids or fragments of positional segments of EST-related nucleic acids which is consistent with its use in PCR reactions. Each first PCR reaction contains 5 ng of genomic DNA, 5 µl of 10X Tth reaction buffer, 0.2 mM of each dNTP, 0.2 µM each of outer adapter primer and outer gene specific primer, 1.1 mM of Mg(OAc)2, and 1 µl of the Tth polymerase 50X mix in a total volume of 50 µl. The reaction cycle for the first PCR reaction is as follows: 1 min at 94°C / 2 sec at 94°C, 3 min at 72°C (7 cycles) / 2 sec at 94°C, 3 min at 67°C (32 cycles) / 5 min at 67°C.
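Whether a candidate gene specific primer has a melting temperature consistent with its use in PCR can be screened with a quick estimate. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough approximation for short oligonucleotides and is offered here as an assumption rather than the method prescribed by the kit; the function names and example primer are hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature (degrees C) by the Wallace rule."""
    primer = primer.upper()
    unexpected = set(primer) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected bases: {sorted(unexpected)}")
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)
```

For the hypothetical 20-mer ATGCATGCATGCATGCATGC the estimate is 60 degrees C with a GC fraction of 0.5. Nearest-neighbor calculations give better estimates for the longer primers typically used in genome walking.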
[0439] The product of the first PCR reaction is diluted and used
as a template for a second PCR reaction according to the
manufacturer's instructions using a pair of nested primers which
are located internally on the amplicon resulting from the first
PCR reaction. For example, 5 µl of the reaction product of the
first PCR reaction mixture may be diluted [...] times. Reactions
are made in a 50 µl volume having a composition identical to that
of the first PCR reaction except the nested primers are used. The
first nested primer is specific for the adapter, and is provided
with the GenomeWalker™ kit. The second nested primer is specific
for the particular EST-related nucleic acids, positional segments
of EST-related nucleic acids or fragments of positional segments
of EST-related nucleic acids for which the promoter is to be
cloned and should have a melting temperature, length, and location
in the EST-related nucleic acids, positional segments of
EST-related nucleic acids or fragments of positional segments of
EST-related nucleic acids which is consistent with its use in PCR
reactions. The reaction parameters of the second PCR reaction are
as follows: 1 min at 94°C / 2 sec at 94°C, 3 min at 72°C (6
cycles) / 2 sec at 94°C, 3 min at 67°C (25 cycles) / 5 min at
67°C. The product of the second PCR reaction is purified, cloned,
and sequenced using standard techniques.
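
The cycling parameters of the second PCR reaction can be written down as data, which makes it easy to check the programmed run time before loading a thermal cycler. The following is only a sketch: the `Stage` class and `hold_seconds` function are hypothetical names introduced here, the count of 25 cycles for the 67°C stage is an assumption about the text above, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One stage of a thermal cycling program (illustrative structure)."""
    steps: tuple      # ((temperature in deg C, hold time in seconds), ...)
    cycles: int = 1   # number of times the steps are repeated


def hold_seconds(program):
    """Total programmed hold time; ramping between temperatures is ignored."""
    return sum(
        stage.cycles * sum(seconds for _, seconds in stage.steps)
        for stage in program
    )


# Second (nested) PCR reaction as described in the text.
# Assumption: the 67 deg C stage runs for 25 cycles.
SECOND_PCR = (
    Stage(steps=((94, 60),)),                      # 1 min at 94
    Stage(steps=((94, 2), (72, 180)), cycles=6),   # 2 sec at 94, 3 min at 72
    Stage(steps=((94, 2), (67, 180)), cycles=25),  # 2 sec at 94, 3 min at 67
    Stage(steps=((67, 300),)),                     # 5 min at 67
)

if __name__ == "__main__":
    total = hold_seconds(SECOND_PCR)
    print(f"{total} s of programmed hold time (about {total / 60:.0f} min)")
```

Under these assumptions the program holds for 6002 seconds, or roughly 100 minutes, before ramping overhead.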

[0440] Alternatively, two or more human genomic DNA libraries can
be constructed by using two or more restriction enzymes. The
digested genomic DNA is cloned into vectors which can be converted
into single stranded, circular, or linear DNA. A biotinylated
oligonucleotide comprising at least 15 nucleotides from the
EST-related nucleic acids, positional segments of EST-related
nucleic acids or fragments of positional segments of EST-related
nucleic acids sequence is hybridized to the single stranded DNA.
Hybrids between the biotinylated oligonucleotide and the single
stranded DNA containing the EST-related nucleic acids, positional
segments of EST-related nucleic acids or fragments of positional
segments of EST-related nucleic acids are isolated as described
above. Thereafter, the single stranded DNA containing the
EST-related nucleic acids, positional segments of EST-related
nucleic acids or fragments of positional segments of EST-related
nucleic acids is released from the beads and converted into double
stranded DNA using a primer specific for the EST-related nucleic
acids, positional segments of EST-related nucleic acids or
fragments of positional segments of EST-related nucleic acids or a
primer corresponding to a sequence included in the cloning vector.
The resulting double stranded DNA is transformed into bacteria.
cDNAs containing the EST-related nucleic acids, positional
segments of EST-related nucleic acids or fragments of positional
segments of EST-related nucleic acids are identified by colony PCR
or colony hybridization.

[0441] Once the upstream genomic sequences have been cloned and
sequenced as described above, prospective promoters and
transcription start sites within the upstream sequences may be
identified by comparing the sequences upstream of the EST-related
nucleic acids, positional segments of EST-related nucleic acids or
fragments of positional segments of EST-related nucleic acids with
databases containing known transcription start sites,
transcription factor binding sites, or promoter sequences.
[0442] In addition, promoters in the upstream sequences may be
identified using promoter reporter vectors as described in
Example 53.
EXAMPLE 53

Identification of Promoters in Cloned Upstream Sequences

[0443] The genomic sequences upstream of the EST-related nucleic
acids, positional segments of EST-related nucleic acids or
fragments of positional segments of EST-related nucleic acids are
cloned into a suitable promoter reporter vector, such as the
pSEAP-Basic, pSEAP-Enhancer, pβgal-Basic, pβgal-Enhancer, or
pEGFP-1 Promoter Reporter vectors available from Clontech.
Briefly, each of these promoter reporter vectors includes multiple
cloning sites positioned upstream of a reporter gene encoding a
readily assayable protein such as secreted alkaline phosphatase,
β galactosidase, or green fluorescent protein. The sequences
upstream of the EST-related nucleic acids, positional segments of
EST-related nucleic acids or fragments of positional segments of
EST-related nucleic acids are inserted into the cloning sites
upstream of the reporter gene in both orientations and introduced
into an appropriate host cell. The level of reporter protein is
assayed and compared to the level obtained from a vector which
lacks an insert in the cloning site. The presence of an elevated
expression level in the vector containing the insert with respect
to the control vector indicates the presence of a promoter in the
insert. If necessary, the upstream sequences can be cloned into
vectors which contain an enhancer for augmenting transcription
levels from weak promoter sequences. A significant level of
expression above that observed with the vector lacking an insert
indicates that a promoter sequence is present in the inserted
upstream sequence.
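
The comparison between the insert-containing vector and the insert-free control can be expressed as a simple fold-over-control calculation. The sketch below is illustrative only: the function names and the two-fold cutoff are assumptions introduced here, since the text asks for a "significant" elevation without fixing a threshold, and a real analysis would also apply a statistical test to replicate readings.

```python
from statistics import mean


def fold_over_control(insert_readings, control_readings):
    """Mean reporter signal of the insert vector divided by that of the
    insert-free control vector."""
    if not insert_readings or not control_readings:
        raise ValueError("both vectors need at least one reading")
    control = mean(control_readings)
    if control <= 0:
        raise ValueError("control signal must be positive")
    return mean(insert_readings) / control


def suggests_promoter(insert_readings, control_readings, min_fold=2.0):
    """True when expression exceeds the control by at least min_fold.

    min_fold is a hypothetical cutoff chosen for illustration.
    """
    return fold_over_control(insert_readings, control_readings) >= min_fold


if __name__ == "__main__":
    # Made-up reporter readings in arbitrary units.
    insert = [300, 340, 320]
    control = [100, 110, 90]
    print(fold_over_control(insert, control), suggests_promoter(insert, control))
```

Running the same calculation for each orientation of the insert, and for enhancer-containing vectors, keeps the comparisons on a common scale.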
[0444] Appropriate host cells for the promoter reporter vectors
may be chosen based on the results of the above described
determination of expression patterns of the EST-related nucleic
acids, positional segments of EST-related nucleic acids or
fragments of positional segments of EST-related nucleic acids. For
example, if the expression pattern analysis indicates that the
mRNA corresponding to a particular EST-related nucleic acid,
positional segment of EST-related nucleic acids or fragment of
positional segments of EST-related nucleic acids is expressed in
fibroblasts, the promoter reporter vector may be introduced into a
human fibroblast cell line.

[0445] Promoter sequences within the upstream genomic DNA may be
further defined by constructing nested deletions in the upstream
DNA using conventional techniques such as Exonuclease III
digestion. The resulting deletion fragments can be inserted into
the promoter reporter vector to determine whether the deletion has
reduced or obliterated promoter activity. In this way, the
boundaries of the promoters may be defined. If desired, potential
individual regulatory sites within the promoter may be identified
using site directed mutagenesis or linker scanning to obliterate
potential transcription factor binding sites within the promoter
individually or in combination. The effects of these mutations on
transcription levels may be determined by inserting the mutations
into the cloning sites in the promoter reporter vectors.

EXAMPLE 54

Cloning and Identification of Promoters

[0446] Using the method described in Example 51 above with 5'
ESTs, sequences upstream of several genes were obtained. Using the
primer pairs GGG AAG ATG GAG ATA GTA TTG CCT G (SEQ ID NO:15) and
CTG CGA TGT ACA TGA TAG AGA GAT TQ (SEQ ID NO:16), the promoter
having the internal designation P13H2 (SEQ ID NO:17) was obtained.

[0447] Using the primer pairs GTA CCA GGGG ACT GTG ACC ATT GC (SEQ
ID NO:18) and CT G TbACCA TTG CTC CCA AiSAGAG (SEQ ID NO:19), the
promoter having the internal designation P15B4 (SEQ ID NO:20) was
obtained.

[0448] Using the primer pairs CTG GGA TGG AAG GCA CGG TA (SEQ ID
NO:21) and GAG ACC ACA CAG CTA GAG AA (SEQ ID NO:22), the promoter
having the internal designation P29B6 (SEQ ID NO:23) was obtained.

[0449] Figure 4 provides a schematic description of the promoters
isolated and the way they are assembled with the corresponding 5'
tags. The upstream sequences were screened for the presence of
motifs resembling transcription factor binding sites or known
transcription start sites using the computer program MatInspector
release 2.0, August 1996.
[0450] Figure 5 describes the transcription factor binding sites
present in each of these promoters. The column labeled "matrice"
provides the name of the MatInspector matrix used. The column
labeled "position" provides the 5' position of the promoter site.
Numeration of the sequence starts from the transcription site as
determined by matching the genomic sequence with the 5' EST
sequence. The column labeled "orientation" indicates the DNA
strand on which the site is found, with the + strand being the
coding strand as determined by matching the genomic sequence with
the sequence of the 5' EST. The column labeled "score" provides
the MatInspector score found for this site. The column labeled
"length" provides the length of the site in nucleotides. The
column labeled "sequence" provides the sequence of the site found.
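
A matrix-based scan of an upstream sequence that reports position, orientation, score, and site sequence, like the columns described above, can be sketched as follows. This is a simplified illustration and not the MatInspector algorithm: the scoring here is just the summed matrix weight divided by the best achievable weight, and `TOY_GATA` is a made-up matrix rather than a MatInspector matrix.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_both_strands(sequence, matrix, threshold):
    """Slide a weight matrix along both strands of a sequence.

    matrix is a list with one dict per site position mapping a base to
    its weight. Returns (position, strand, score, site) tuples, where
    position is the 0-based left end of the window on the + strand.
    """
    seq = sequence.upper()
    width = len(matrix)
    best = sum(max(column.values()) for column in matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            raw = sum(col.get(base, 0.0) for col, base in zip(matrix, site))
            score = raw / best
            if score >= threshold:
                hits.append((start, strand, round(score, 3), site))
    return hits


def _column(preferred):
    """Toy matrix column strongly favouring one base (illustrative weights)."""
    return {base: (0.85 if base == preferred else 0.05) for base in "ACGT"}


# Hypothetical 4-position matrix for the motif GATA.
TOY_GATA = [_column(base) for base in "GATA"]

if __name__ == "__main__":
    for hit in scan_both_strands("CCGATACCTATCCC", TOY_GATA, threshold=0.85):
        print(hit)
```

Reporting the window on the strand where it matches mirrors the "orientation" column; converting the 0-based position to numbering relative to the transcription start site is left to the caller.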

[0451] Bacterial clones containing plasmids containing the
promoter sequences described above are presently stored in the
inventor's laboratories under the internal identification numbers
provided above. The inserts may be recovered from the deposited
materials by growing an aliquot of the appropriate bacterial clone
in the appropriate medium. The plasmid DNA can then be isolated
using plasmid isolation procedures familiar to those skilled in
the art such as alkaline lysis minipreps or large scale alkaline
lysis plasmid isolation procedures. If desired the plasmid DNA may
be further enriched by centrifugation on a cesium chloride
gradient, size exclusion chromatography, or anion exchange
chromatography. The plasmid DNA obtained using these procedures
may then be manipulated using standard cloning techniques familiar
to those skilled in the art. Alternatively, a PCR can be done with
primers designed at both ends of the inserted EST-related nucleic
acids, positional segments of EST-related nucleic acids or
fragments of positional segments of EST-related nucleic acids. The
PCR product which corresponds to the EST-related nucleic acids,
positional segments of EST-related nucleic acids or fragments of
positional segments of EST-related nucleic acids can then be
manipulated using standard cloning techniques familiar to those
skilled in the art.

[0452] The promoters and other regulatory sequences located
upstream of the EST-related nucleic acids, positional segments of
EST-related nucleic acids or fragments of positional segments of
EST-related nucleic acids may be used to design expression vectors
capable of directing the expression of an inserted gene in a
desired spatial, temporal, developmental, or quantitative manner.
A promoter capable of directing the desired spatial, temporal,
developmental, and quantitative patterns may be selected using the
results of the expression analysis described above. For example,
if a promoter which confers a high level of expression in muscle
is desired, the promoter sequence upstream of EST-related nucleic
acids, positional segments of EST-related nucleic acids or
fragments of positional segments of EST-related nucleic acids
derived from an mRNA which is expressed at a high level in muscle,
as determined by the methods above, may be used in the expression
vector.

[0453] Preferably, the desired promoter is placed near multiple
restriction sites to facilitate the cloning of the desired insert
downstream of the promoter, such that the promoter is able to
drive expression of the inserted gene. The promoter may be
inserted in conventional nucleic acid backbones designed for
extrachromosomal replication, integration into the host
chromosomes or transient expression. Suitable backbones for the
present expression vectors include retroviral backbones, backbones
from eukaryotic episomes such as SV40 or Bovine Papilloma Virus,
backbones from bacterial episomes, or artificial chromosomes.
[0454] Preferably, the expression vectors also include a polyA
signal downstream of the multiple restriction sites for directing
the polyadenylation of mRNA transcribed from the gene inserted
into the expression vector.

[0455] Following the identification of promoter sequences using
the procedures of Examples 51-54, proteins which interact with the
promoter may be identified as described in Example 55 below.



EXAMPLE 55 

Identification of Proteins Which Interact with Promoter
Sequences, Upstream Regulatory Sequences, or mRNA

[0456] Sequences within the promoter region which are likely to
bind transcription factors may be identified by homology to known
transcription factor binding sites or through conventional
mutagenesis or deletion analyses of reporter plasmids containing
the promoter sequence. For example, deletions may be made in a
reporter plasmid containing the promoter sequence of interest
operably linked to an assayable reporter gene. The reporter
plasmids carrying various deletions within the promoter region are
transfected into an appropriate host cell and the effects of the
deletions on expression levels are assessed. Transcription factor
binding sites within the regions in which deletions reduce
expression levels may be further localized using site directed
mutagenesis, linker scanning analysis, or other techniques
familiar to those skilled in the art.
[0457] Nucleic acids encoding proteins which interact with
sequences in the promoter may be identified using one-hybrid
systems such as those described in the manual accompanying the
Matchmaker One-Hybrid System kit available from Clontech (Catalog
No. [...]). Briefly, the Matchmaker One-Hybrid system is used as
follows. The target sequence for which it is desired to identify
binding proteins is cloned upstream of a selectable reporter gene
and integrated into the yeast genome. Preferably, multiple copies
of the target sequences are inserted into the reporter plasmid in
tandem. A library comprised of fusions between cDNAs to be
evaluated for the ability to bind to the promoter and the
activation domain of a yeast transcription factor, such as GAL4,
is transformed into the yeast strain containing the integrated
reporter sequence. The yeast are plated on selective media to
select cells expressing the selectable marker linked to the
promoter sequence. The colonies which grow on the selective media
contain genes encoding proteins which bind the target sequence.
The inserts in the genes encoding the fusion proteins are further
characterized by sequencing. In addition, the inserts may be
inserted into expression vectors or in vitro transcription
vectors. Binding of the polypeptides encoded by the inserts to the
promoter DNA may be confirmed by techniques familiar to those
skilled in the art, such as gel shift analysis or DNAse protection
analysis.

VII. Use of EST-related nucleic acids, positional
segments of EST-related nucleic acids or fragments
of positional segments of EST-related nucleic acids
in Gene Therapy

[0458] The present invention also comprises the use of EST-related
nucleic acids, positional segments of EST-related nucleic acids or
fragments of positional segments of EST-related nucleic acids in
gene therapy strategies, including antisense and triple helix
strategies as described in Examples 56 and 57 below. In antisense
approaches, nucleic acid sequences complementary to an mRNA are
hybridized to the mRNA intracellularly, thereby blocking the
expression of the protein encoded by the mRNA. The antisense
sequences may prevent gene expression through a variety of
mechanisms. For example, the antisense sequences may inhibit the
ability of ribosomes to translate the mRNA. Alternatively, the
antisense sequences may block transport of the mRNA from the
nucleus to the cytoplasm, thereby limiting the amount of mRNA
available for translation. Another mechanism through which
antisense sequences may inhibit gene expression is by interfering
with mRNA splicing. In yet another strategy, the antisense nucleic
acid may be incorporated in a ribozyme capable of specifically
cleaving the target mRNA.

EXAMPLE 56

Preparation and Use of Antisense Oligonucleotides

[0459] The antisense nucleic acid molecules to be used in gene
therapy may be either DNA or RNA sequences. They may comprise a
sequence complementary to the sequence of the EST-related nucleic
acids, positional segments of EST-related nucleic acids or
fragments of positional segments of EST-related nucleic acids. The
antisense nucleic acids should have a length and melting
temperature sufficient to permit formation of an intracellular
duplex with sufficient stability to inhibit the expression of the
mRNA in the duplex. Strategies for designing antisense nucleic
acids suitable for use in gene therapy are disclosed in Green et
al., Ann. Rev. Biochem. 55:569-597 (1986) and Izant and Weintraub,
Cell 36:1007-1015 (1984).
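
As a minimal illustration of the complementarity requirement, the sketch below derives a DNA antisense oligonucleotide from a target mRNA region and gives a rough melting temperature estimate. The function names are hypothetical, and the estimate uses the Wallace rule (2°C per A/T and 4°C per G/C), which is only a coarse approximation for short oligonucleotides and is an assumption of this example rather than a method named in the text.

```python
_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_dna(target):
    """Return the DNA oligonucleotide complementary to a target region.

    The target may be written as RNA (with U) or as DNA (with T); the
    result is written 5' to 3'.
    """
    target = target.upper()
    unknown = set(target) - set(_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(target))


def wallace_tm(oligo):
    """Rough melting temperature in deg C by the Wallace rule.

    Intended only for a first-pass comparison of short DNA oligos.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    oligo = antisense_dna("AUGGCC")  # made-up target region
    print(oligo, wallace_tm(oligo))
```

Candidate oligonucleotides designed this way would still need the empirical testing described in the following paragraphs.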

[0460] In some strategies, antisense molecules are obtained from a
nucleotide sequence encoding a protein by reversing the
orientation of the coding region with respect to a promoter so as
to transcribe the opposite strand from that which is normally
transcribed in the cell. The antisense molecules may be
transcribed using in vitro transcription systems such as those
which employ T7 or SP6 polymerase to generate the transcript.
Another approach involves transcription of the antisense nucleic
acids in vivo by operably linking DNA containing the antisense
sequence to a promoter in an expression vector.

[0461] Alternatively, oligonucleotides which are complementary to
the strand normally transcribed in the cell may be synthesized in
vitro. Thus, the antisense nucleic acids are complementary to the
corresponding mRNA and are capable of hybridizing to the mRNA to
create a duplex. In some embodiments, the antisense sequences may
contain modified sugar phosphate backbones to increase stability
and make them less sensitive to RNase activity. Examples of
modifications suitable for use in antisense strategies are
described by Rossi et al.

[0462] Various types of antisense oligonucleotides complementary
to the sequence of the EST-related nucleic acids, positional
segments of EST-related nucleic acids or fragments of positional
segments of EST-related nucleic acids may be used. In one
preferred embodiment, stable and semi-stable antisense
oligonucleotides described in International Application No. PCT
WO94/23026 are used. In these molecules, the 3' end or both the 3'
and 5' ends are engaged in intramolecular hydrogen bonding between
complementary base pairs. These molecules are better able to
withstand exonuclease attacks and exhibit increased stability
compared to conventional antisense oligonucleotides.
[0463] In another preferred embodiment, the antisense
oligodeoxynucleotides against herpes simplex virus types 1 and 2
described in International Application No. WO 95/04141 are used.

[0464] In yet another preferred embodiment, the covalently
cross-linked antisense oligonucleotides described in International
Application No. WO 96/31523 are used. These double- or
single-stranded oligonucleotides comprise one or more,
respectively, inter- or intra-oligonucleotide covalent
cross-linkages, wherein the linkage consists of an amide bond
between a primary amine group of one strand and a carboxyl group
of the other strand or of the same strand, respectively, the
primary amine group being directly substituted in the 2' position
of the strand nucleotide monosaccharide ring, and the carboxyl
group being carried by an aliphatic spacer group substituted on a
nucleotide or nucleotide analog of the other strand or the same
strand, respectively.
[0465] The antisense oligodeoxynucleotides and oligonucleotides
disclosed in International Application No. WO 92/18522 may also be
used. These molecules are stable to degradation and contain at
least one transcription control recognition sequence which binds
to control proteins and are effective as decoys therefor. These
molecules may contain "hairpin" structures, "dumbbell" structures,
"modified dumbbell" structures, "cross-linked" decoy structures
[...]

[0466] In another preferred embodiment, the cyclic double-stranded
oligonucleotides described in European Patent Application No.
0 572 287 A2 are used. These ligated oligonucleotide "dumbbells"
contain the binding site for a transcription factor and inhibit
expression of the gene under control of the transcription factor
by sequestering the factor.

[0467] Use of the closed antisense oligonucleotides disclosed in
International Application No. WO 92/19732 is also contemplated.
Because these molecules have no free ends, they are more resistant
to degradation by exonucleases than are conventional
oligonucleotides. These oligonucleotides may be multifunctional,
interacting with several regions which are not adjacent to the
target mRNA.

[0468] The appropriate level of antisense nucleic acids required
to inhibit gene expression may be determined using in vitro
expression analysis. The antisense molecule may be introduced into
the cells by diffusion, injection, infection or transfection using
procedures known in the art. For example, the antisense nucleic
acids can be introduced into the body as a bare or naked
oligonucleotide, oligonucleotide encapsulated in lipid,
oligonucleotide sequence encapsidated by viral protein, or as an
oligonucleotide operably linked to a promoter contained in an
expression vector. The expression vector may be any of a variety
of expression vectors known in the art, including retroviral or
viral vectors, vectors capable of extrachromosomal replication, or
integrating vectors. The vectors may be DNA or RNA.
[0469] The antisense molecules are introduced onto cell samples at
a number of different concentrations, preferably between
1x10^-10 M and 1x10^-4 M. Once the minimum concentration that can
adequately control gene expression is identified, the optimized
dose is translated into a dosage suitable for use in vivo. For
example, an inhibiting concentration in culture of 1x10^-7 M
translates into a dose of approximately 0.6 mg/kg bodyweight.
Levels of oligonucleotide approaching 100 mg/kg bodyweight or
higher may be possible after testing the toxicity of the
oligonucleotide in laboratory animals. It is additionally
contemplated that cells from the vertebrate are removed, treated
with the antisense oligonucleotide, and reintroduced into the
vertebrate.
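
The titration described above amounts to testing a series of concentrations and picking the lowest one that gives adequate inhibition. A minimal sketch follows, with several assumptions stated plainly: the function names are hypothetical, the series uses ten-fold steps, the upper bound of the range is an assumed value, and the 80% inhibition cutoff is chosen only for illustration because the text does not define "adequately control".

```python
def decade_series(low_exp=-10, high_exp=-4):
    """Ten-fold concentration series in mol/L, e.g. 1e-10 ... 1e-4.

    The default exponents are assumptions about the range in the text.
    """
    return [float(f"1e{exp}") for exp in range(low_exp, high_exp + 1)]


def minimal_effective(results, min_inhibition=0.8):
    """Lowest tested concentration reaching the inhibition cutoff.

    results maps concentration (mol/L) to fractional inhibition of gene
    expression (0.0 to 1.0). Returns None if no concentration qualifies.
    min_inhibition is a hypothetical cutoff.
    """
    effective = [conc for conc, inhibition in results.items()
                 if inhibition >= min_inhibition]
    return min(effective) if effective else None


if __name__ == "__main__":
    print(decade_series())
    # Made-up assay results for illustration.
    assay = {1e-9: 0.10, 1e-8: 0.35, 1e-7: 0.85, 1e-6: 0.93, 1e-5: 0.95}
    print(minimal_effective(assay))
```

The conversion from the selected in vitro concentration to an in vivo dose is not modeled here, since the text gives only a single worked correspondence and no general formula.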

[0470] It is further contemplated that the antisense
oligonucleotide sequence is incorporated into a ribozyme sequence
to enable the antisense to specifically bind and cleave its target
mRNA. For technical applications of ribozyme and antisense
oligonucleotides see Rossi et al., supra.

[0471] In a preferred application of this invention, the
polypeptide encoded by the gene is first identified, so that the
effectiveness of antisense inhibition on translation can be
monitored using techniques that include but are not limited to
antibody-mediated tests such as RIAs and ELISA, functional assays,
or radiolabeling.
[0472] The EST-related nucleic acids, positional segments of
EST-related nucleic acids or fragments of positional segments of
EST-related nucleic acids may also be used in gene therapy
approaches based on intracellular triple helix formation. Triple
helix oligonucleotides are used to inhibit transcription from a
genome. They are particularly useful for studying alterations in
cell activity as it is associated with a particular gene. The
EST-related nucleic acids, positional segments of EST-related
nucleic acids or fragments of positional segments of EST-related
nucleic acids of the present invention or, more preferably, a
portion of those sequences, can be used to inhibit gene expression
in individuals having diseases associated with expression of a
particular gene. Similarly, the EST-related nucleic acids,
positional segments of EST-related nucleic acids or fragments of
positional segments of EST-related nucleic acids can be used to
study the effect of inhibiting transcription of a particular gene
within a cell. Traditionally, homopurine sequences were considered
the most useful for triple helix strategies. However,
homopyrimidine sequences can also inhibit gene expression. Such
homopyrimidine oligonucleotides bind to the major groove at
homopurine:homopyrimidine sequences. Thus, both types of sequences
from the EST-related nucleic acids, positional segments of
EST-related nucleic acids or fragments of positional segments of
EST-related nucleic acids are contemplated within the scope of
this invention.

EXAMPLE 57

Preparation and Use of Triple Helix Probes

[0473] The sequences of the EST-related nucleic acids, positional
segments of EST-related nucleic acids or fragments of positional
segments of EST-related nucleic acids are scanned to identify
10-mer to 20-mer homopyrimidine or homopurine stretches which
could be used in triple-helix based strategies for inhibiting gene
expression. Following identification of candidate homopyrimidine
or homopurine stretches, their efficiency in inhibiting gene
expression is assessed by introducing varying amounts of
oligonucleotides containing the candidate sequences into tissue
culture cells which normally express the target gene. The
oligonucleotides may be prepared on an oligonucleotide synthesizer
or they may be purchased commercially from a company specializing
in custom oligonucleotide synthesis, such as GENSET, Paris,
France.
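
The scan for homopurine or homopyrimidine stretches can be sketched with a regular expression. This is an illustrative example and not the procedure used by the inventors: `find_homo_runs` is a hypothetical helper that reports every maximal run of at least `min_len` purines (A/G) or pyrimidines (C/T); choosing a 10-mer to 20-mer window inside a longer run is left to the caller.

```python
import re


def find_homo_runs(sequence, min_len=10):
    """Find maximal homopurine and homopyrimidine runs in a DNA sequence.

    Returns (start, run, kind) tuples, where start is the 0-based
    position of the run and kind is "homopurine" or "homopyrimidine".
    """
    pattern = re.compile(rf"[AG]{{{min_len},}}|[CT]{{{min_len},}}")
    runs = []
    for match in pattern.finditer(sequence.upper()):
        run = match.group()
        kind = "homopurine" if run[0] in "AG" else "homopyrimidine"
        runs.append((match.start(), run, kind))
    return runs


if __name__ == "__main__":
    # Made-up sequence containing one 12-nucleotide purine run.
    for start, run, kind in find_homo_runs("ACGTAGGAGGAGGAGATCAG"):
        print(start, len(run), kind, run)
```

Each reported run is a candidate region from which oligonucleotides of the desired length can be selected for testing in tissue culture cells.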
[0474] The oligonucleotides may be introduced into the cells using
a variety of methods known to those skilled in the art, including
but not limited to calcium phosphate precipitation, DEAE-Dextran,
electroporation, liposome-mediated transfection or native uptake.
[0475] Treated cells are monitored for altered cell function or
reduced gene expression using techniques such as Northern
blotting, RNase protection assays, or PCR based strategies to
monitor the transcription levels of the target gene in cells which
have been treated with the oligonucleotide. The cell functions to
be monitored are predicted based upon the homologies of the target
genes corresponding to the EST-related nucleic acids, positional
segments of EST-related nucleic acids or fragments of positional
segments of EST-related nucleic acids from which the
oligonucleotides were derived with known gene sequences that have
been associated with a particular function. The cell functions can
also be predicted based on the presence of abnormal physiologies
within cells derived from individuals with a particular inherited
disease, particularly when the EST-related nucleic acids,
positional segments of EST-related nucleic acids or fragments of
positional segments of EST-related nucleic acids are associated
with the disease using techniques described herein.
[0476] The oligonucleotides which are effective in inhibiting gene
expression in tissue culture cells may then be introduced in vivo
using the techniques described above and in Example 56 at a dosage
calculated based on the in vitro results, as described in
Example 56.
[0477] In some embodiments, the natural (beta) anomers of the
oligonucleotide units can be replaced with alpha anomers to render
the oligonucleotide more resistant to nucleases. Further, an
intercalating agent such as ethidium bromide, or the like, can be
attached to the 3' end of the alpha oligonucleotide to stabilize
the triple helix. For information on the generation of
oligonucleotides suitable for triple helix formation see Griffin
et al. (Science 245:967-971 (1989)).



EXAMPLE 58

Use of EST-related nucleic acids, positional segments
of EST-related nucleic acids or fragments of positional
segments of EST-related nucleic acids to express an
Encoded Protein in a Host Organism

[0478] The EST-related nucleic acids, positional segments of
EST-related nucleic acids or fragments of positional segments of
EST-related nucleic acids may also be used to express an encoded
protein or polypeptide in a host organism to produce a beneficial
effect. In addition, nucleic acids encoding the EST-related
polypeptides, positional segments of EST-related polypeptides or
fragments of positional segments of EST-related polypeptides may
be used to express the encoded protein or polypeptide in a host
organism to produce a beneficial effect.

[0479] In such procedures, the encoded protein or polypeptide may
be transiently expressed in the host organism or stably expressed
in the host organism. The encoded protein or polypeptide may have
any of the activities described above. The encoded protein or
polypeptide may be a protein or polypeptide which the host
organism lacks or, alternatively, the encoded protein may augment
the existing levels of the protein in the host organism.

[0480] In some embodiments in which the protein or polypeptide is
secreted, nucleic acids encoding the full length protein (i.e. the
signal peptide and the mature protein), or nucleic acids encoding
only the mature protein (i.e. the protein generated when the
signal peptide is cleaved off) are introduced into the host
organism.
[0481] The nucleic acids encoding the proteins or polypeptides may
be introduced into the host organism using a variety of techniques
known to those of skill in the art. For example, the extended cDNA
may be injected into the host organism such that the encoded
protein is expressed in the host organism, thereby producing a
beneficial effect.

[0482] Alternatively, the nucleic acids encoding the protein or
polypeptide may be cloned into an expression vector downstream of
a promoter which is active in the host organism. The expression
vector may be any of the expression vectors designed for use in
gene therapy, including viral or retroviral vectors. The
expression vector may be directly introduced into the host
organism such that the encoded protein is expressed in the host
organism to produce a beneficial effect. In another approach, the
expression vector may be introduced into cells in vitro. Cells
containing the expression vector are thereafter selected and
introduced into the host organism, where they express the encoded
protein or polypeptide to produce a beneficial effect.

EXAMPLE 59

Use of Signal Peptides To Import Proteins Into Cells

[0483] The short core hydrophobic region (h) of signal peptides
encoded by the sequences of SEQ ID NOs: 24-652 and 3721-3811 may
also be used as a carrier to import a peptide or a protein of
interest, so-called cargo, into tissue culture cells (Lin et al.,
[...] 14225-14258 (1995); Du et al., [...] 235-243 (1998); Rojas
et al., Nature Biotech., 16: 370-375 (1998)).

[0484] When cell permeable peptides of limited size (approximately
up to 25 amino acids) are to be translocated across the cell
membrane, chemical synthesis may be used in order to add the h
region to either the C-terminus or the N-terminus of the cargo
peptide of interest. Alternatively, when longer peptides or
proteins are to be imported into cells, nucleic acids can be
genetically engineered, using techniques familiar to those skilled
in the art, in order to link the extended cDNA sequence encoding
the h region to the 5' or the 3' end of a DNA sequence coding for
a cargo polypeptide. Such genetically engineered nucleic acids are
then translated either in vitro or in vivo after transfection into
appropriate cells, using conventional techniques to produce the
resulting cell permeable polypeptide. Suitable host cells are then
simply incubated with the cell permeable polypeptide, which is
then translocated across the membrane.
[0485] This method may be applied to study diverse intracellular
functions and cellular processes. For instance, it has been used
to probe functionally relevant domains of intracellular proteins
and to examine protein-protein interactions involved in signal
transduction pathways (Lin et al., supra; Lin et al., J. Biol.
Chem., 271: 5305-5308 (1996); Rojas et al., J. Biol. Chem., 271:
27456-27461 (1996); Liu et al., Proc. Natl. Acad. Sci. USA, 93:
11819-11824 (1996); Rojas et al., Bioch. Biophys. Res. Commun.,
234: 675-680 (1997)).
[0486] Such techniques may be used in cellular therapy [...]. For
instance, cells isolated from a patient may be treated with
imported therapeutic proteins and then reintroduced into the host
organism.
[0487] Alternatively, the h region of signal peptides of the
present invention could be used in combination with a nuclear
localization signal to deliver nucleic acids into the cell
nucleus. Such oligonucleotides may be antisense
oligonucleotides or oligonucleotides designed to form triple
helixes, as described above, in order to inhibit processing and
maturation of a target cellular RNA.

EXAMPLE 60

Computer Embodiments

[0488] As used herein the term "nucleic acid codes of SEQ ID NOs:
24-4100 and 8178-36681" encompasses the nucleotide sequences of
SEQ ID NOs: 24-4100 and 8178-36681, fragments of SEQ ID NOs:
24-4100 and 8178-36681, nucleotide sequences homologous to SEQ ID
NOs: 24-4100 and 8178-36681 or homologous to fragments of SEQ ID
NOs: 24-4100 and 8178-36681, and sequences complementary to all of
the preceding sequences. The fragments include portions of SEQ ID
NOs: 24-4100 and 8178-36681 comprising at least 10, 15, 20, 25,
30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 consecutive
nucleotides of SEQ ID NOs: 24-4100 and 8178-36681. Preferably, the
fragments are novel fragments. Homologous sequences and fragments
of SEQ ID NOs: 24-4100 and 8178-36681 refer to a sequence having
at least 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80% or 75% homology to
these sequences. Homology may be determined using any of the
computer programs and parameters described in Example 18,

The polypeptide fragments comprise at least 5, 10, 15, 20, 25, 30,
35, 40, 50, 75, 100, or 150 consecutive amino acids of the
polypeptides of SEQ ID NOs: 4101-8177. Preferably, the fragments
are novel fragments. It will be appreciated that the polypeptide
codes of SEQ ID NOs: 4101-8177 can be represented in single
character format or three letter format (See the inside back cover
of Stryer, [...] edition, W. H. Freeman & Co. [...]) or in any
other format which relates the identity of the polypeptides in a
sequence.

[0490] It will be appreciated by those skilled in the art that the
nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 and
polypeptide codes of SEQ ID NOs: 4101-8177 can be stored,
recorded, and manipulated on any medium which can be read and
accessed by a computer. As used herein, the words "recorded" and
"stored" refer to a process for storing information on a computer
medium. A skilled artisan can readily adopt any of the presently
known methods for recording information on a computer readable
medium to generate manufactures comprising one or more of the
nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681, one or
more of the polypeptide codes of SEQ ID NOs: 4101-8177. Another
aspect of the present invention is a computer readable medium
having recorded thereon at least 2, 5, 10, 15, 20, 25, 30, [...]
nucleic acid codes

including BLAST2N with trie default parameters or with of SEQ ID NOs: 24^4100 and 8178-36681; Another as^ 

any mbcfified parameters: Hcm)k>gous sequences also pectbf the present invention is a~cc<nputer readable m&- 

inciude RNA sequences in which uridines replace the dium having recorded thereon at least 2, 5, 10, 15, 20, 

thymines in the hucleio acid codes of SEQ ID NOs: 25, 30, or 50 polypeptide cedes of SEQ ID NOs:? 

24-4100 and 8178-36681 . Trie hornologous sequences 4101^177, • ■■>•<";■■ V; > " Wr :^.\;. ^ v j 1 

may be obtained us«g any of the procedures described [04911 Computer reaoablemed^ 

herein or^fhay resuft frorn theco^ ly readable media, optically readable meola; electron^ 

error as o^scr jt wffl be appreciated trim me & cally readable media and ma^ti^optic^ media: For 

nucleic acid codes of SEQ ID NOs: ; 244100 and example; the computer readable media may be a hard : j 

8178-36681 can be represented in me traditional single disc, a floppy disc, a magnetic tape; 1 CD-TOM. DVD, 

character lomiat (See the inside back cover of Starrier, :? r RAM; or ROM as wellas other types of ether media 

Lubert Bhcbbmsstry, 3** edition. W H Freeman &cC6^ known to those skilled in the art ^''^'^'^''t 

NewY6rfc!)c*fcanytf 40 [0492] fifTiboa^^ 

tity of the hdcleo^kies in a sequence. > --^ systems; particularly computer systems which contain 

[0489]' As used herein the term "polyipeptide codes of ^ the sequence information descrfcec) herein; As used 

SEQ ID NOs: 410^ herein, -fla [ computer system* ref^^ hardware 

tide seb^erice of SEQ ID NOs: 4101^1 77 which are eri- components, software components, ^d 

coded by the & EST s of SEQ ID NOs: 24^1 OO arid - & components used to an^^e the nufc*^ 

81 78-36681 , polypeptide sequences rkimologbus tothe of the nucleic acid codes of SEQ Id NOs: 24-41 00 -aid ? 

polypeptides of SEQ ID NOs: 4101-81 77, or f ragrnerits^ 8178-36681, or the amirk> ackl sequences of the 

of any of the preceding sequences. Homologous- polypeptide codes of SEQ ID^Nds: 4101^177 ^ew 

polypeptide sequences refer to a pc>lypej^kfo sequence computer system prelera^ computer • 

r>av%atl^ *> rea<fed)le rhe^ 

09%, 76% homology to one of the polypeptide sequent ? acc^ing and iri^ 

es of SEQ ID NOs: 4101-8177. Homofogy may be'*-? [0493] Preferably, ^ c<*^ 

termined using any of the cornputer programs ami pa- 1 system that comprises acerarat^ (GPU), 

rameters described herein, tn^ding FASTA wfth i ttte one or more data storage cdrnjaor^ 

default parameters or with any modified parameters. ss and brie or more data retrieving ^ 

Therioriricrtogo^ the data stored oh me dat£ ston^ 

of the procecb^ result fiom skffled artisan can reao^^ 

the correction of a sequencing error as descr toed above. currently available computer systems are suitable. 
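By way of illustration only, the definition of "nucleic acid codes" given in this Example (fragments of a stated number of consecutive nucleotides, complementary sequences, and RNA forms in which uridines replace thymines) can be sketched in a few lines of Python. The function names and the example sequence are hypothetical and are not part of the specification; ambiguity codes are not handled.

```python
from typing import Iterator

# Watson-Crick pairing for the four DNA bases (ambiguity codes are not handled).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def fragments(code: str, length: int) -> Iterator[str]:
    """Yield every span of `length` consecutive nucleotides of `code`."""
    for start in range(len(code) - length + 1):
        yield code[start:start + length]


def reverse_complement(code: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return code.upper().translate(_COMPLEMENT)[::-1]


def to_rna(code: str) -> str:
    """Return the RNA form, with uridine (U) in place of thymine (T)."""
    return code.upper().replace("T", "U")


example = "ATGGCCTTAA"  # made-up sequence, not one of the SEQ ID NOs
print(list(fragments(example, 8)))
print(reverse_complement(example))
print(to_rna(example))
```

A fragment length larger than the sequence simply yields no fragments, which matches the "at least N consecutive nucleotides" wording.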
[0494] In one particular embodiment, the computer system includes a processor connected to a bus which is connected to a main memory (preferably implemented as RAM) and one or more data storage devices, such as a hard drive and/or other computer readable media having data recorded thereon. In some embodiments, the computer system further includes one or more data retrieving devices for reading the data stored on the data storage components. The data retrieving device may represent, for example, a floppy disk drive, a compact disk drive, a magnetic tape drive, etc. In some embodiments, the data storage component is a removable computer readable medium such as a floppy disk, a compact disk, a magnetic tape, etc. containing control logic and/or data recorded thereon. The computer system may advantageously include or be programmed by appropriate software for reading the control logic and/or the data from the data storage component once inserted in the data retrieving device. Software for accessing and processing the nucleotide sequences of the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681, or the amino acid sequences of the polypeptide codes of SEQ ID NOs: 4101-8177 (such as search tools, compare tools, and modeling tools etc.) may reside in main memory during execution.
[0495] In some embodiments, the computer system may further comprise a sequence comparer for comparing the above-described nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or polypeptide codes of SEQ ID NOs: 4101-8177 stored on a computer readable medium to reference nucleotide or polypeptide sequences stored on a computer readable medium. A "sequence comparer" refers to one or more programs which are implemented on the computer system to compare a nucleotide or polypeptide sequence with other nucleotide or polypeptide sequences and/or compounds including but not limited to peptides, peptidomimetics, and chemicals stored within the data storage means. For example, the sequence comparer may compare the nucleotide sequences of the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681, or the amino acid sequences of the polypeptide codes of SEQ ID NOs: 4101-8177 stored on a computer readable medium to reference sequences stored on a computer readable medium to identify homologies, motifs implicated in biological function, or structural motifs. The various sequence comparer programs identified elsewhere in this patent specification are particularly contemplated for use in this aspect of the invention.
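As a purely illustrative sketch, a sequence comparer of the kind just described might hold a set of reference sequences and rank them against a query. The example below scores similarity by the fraction of shared k-mers, which is a deliberately crude stand-in chosen for brevity and is not BLAST, FASTA, or any program named in this specification. The class name, method names, and sequences are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def kmers(sequence: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of the sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


class SequenceComparer:
    """Compare a query with stored reference sequences by shared k-mers."""

    def __init__(self, references: Dict[str, str], k: int = 4):
        self.k = k
        # Pre-compute the k-mer set of every stored reference sequence.
        self._index = {name: kmers(seq, k) for name, seq in references.items()}

    def compare(self, query: str) -> List[Tuple[str, float]]:
        """Return (reference name, shared k-mer fraction), best match first."""
        q = kmers(query, self.k)
        scores = []
        for name, ref in self._index.items():
            union = q | ref
            score = len(q & ref) / len(union) if union else 0.0
            scores.append((name, score))
        return sorted(scores, key=lambda item: item[1], reverse=True)


comparer = SequenceComparer({"ref_same": "ACGTACGTAC", "ref_other": "TTTTTTTTTT"})
print(comparer.compare("ACGTACGTAC"))
```

The same structure works for polypeptide codes, since the scoring treats each sequence as a plain string.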

[0496] Accordingly, one aspect of the present invention is a computer system comprising a processor, a data storage device having stored thereon a nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 or a polypeptide code of SEQ ID NOs: 4101-8177, a data storage device having retrievably stored thereon reference nucleotide sequences or polypeptide sequences to be compared to the nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 or polypeptide code of SEQ ID NOs: 4101-8177 and a sequence comparer for conducting the comparison. The sequence comparer may indicate a homology level between the sequences compared or identify structural motifs in the above described nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 and polypeptide code of SEQ ID NOs: 4101-8177 or it may identify structural motifs in sequences which are compared to these nucleic acid codes and polypeptide codes. In some embodiments, the data storage device may have stored thereon the sequences of at least 2, 5, 10, 15, 20, 25, 30, or 50 of the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or polypeptide codes of SEQ ID NOs: 4101-8177.

[0497] Another aspect of the present invention is a method for determining the level of homology between a nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 and a reference nucleotide sequence, comprising the steps of reading the nucleic acid code and the reference nucleotide sequence through the use of a computer program which determines homology levels and determining homology between the nucleic acid code and the reference nucleotide sequence with the computer program. The computer program may be any of a number of computer programs for determining homology levels, including those specifically enumerated herein, including BLAST2N with the default parameters or with any modified parameters. The method may be implemented using the computer systems described above. The method may also be performed by reading 2, 5, 10, 15, 20, 25, 30, or 50 of the above described nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 through use of the computer program and determining homology between the nucleic acid codes and reference nucleotide sequences.
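For illustration, the two steps of this method (reading the sequences, then determining a homology level) can be sketched as follows. The sketch assumes the two sequences have already been aligned, with "-" marking gaps, and uses percent identity as a simple proxy for the homology level. It does not reproduce BLAST2N, and the function names are hypothetical. The threshold list is the set of percentages recited earlier in this Example.

```python
from typing import Optional

# Homology thresholds recited in this Example, highest first.
HOMOLOGY_LEVELS = (99, 98, 97, 96, 95, 90, 85, 80, 75)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both rows.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def homology_level(identity: float) -> Optional[int]:
    """Return the highest recited threshold met, or None if below 75%."""
    for level in HOMOLOGY_LEVELS:
        if identity >= level:
            return level
    return None


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, homology_level(identity))
```

A gap opposite a residue counts as a mismatch here; other conventions exist and would give slightly different percentages.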

[0498] Alternatively, the computer program may be a computer program which compares the nucleotide sequences of the nucleic acid codes of the present invention, to reference nucleotide sequences in order to determine whether the nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 differs from a reference nucleic acid sequence at one or more positions. Optionally such a program records the length and identity of inserted, deleted or substituted nucleotides with respect to the sequence of either the reference polynucleotide or the nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681. In one embodiment, the computer program may be a program which determines whether the nucleotide sequences of the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 contain a single nucleotide polymorphism (SNP) with respect to a reference nucleotide sequence. The single nucleotide polymorphism may comprise a single base substitution, insertion, or deletion.
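A minimal sketch of such a difference-recording program is given below, assuming the query and reference have already been aligned to equal length with "-" marking gaps. Adjacent gap columns are merged into one insertion or deletion so that the length and identity of each event are recorded, as the paragraph above describes. The `Difference` record and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Difference:
    kind: str        # "substitution", "insertion", or "deletion"
    position: int    # 0-based alignment column where the event starts
    reference: str   # reference bases involved ("" for an insertion)
    query: str       # query bases involved ("" for a deletion)


def find_differences(reference: str, query: str) -> List[Difference]:
    """Record differences of an aligned query with respect to the reference."""
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    diffs: List[Difference] = []
    for i, (r, q) in enumerate(zip(reference.upper(), query.upper())):
        if r == q:
            continue
        if r == "-":
            kind, ref_b, qry_b = "insertion", "", q
        elif q == "-":
            kind, ref_b, qry_b = "deletion", r, ""
        else:
            kind, ref_b, qry_b = "substitution", r, q
        last = diffs[-1] if diffs else None
        # Extend a run of adjacent gap columns into a single indel event.
        if (last is not None and kind != "substitution" and last.kind == kind
                and last.position + max(len(last.reference), len(last.query)) == i):
            last.reference += ref_b
            last.query += qry_b
        else:
            diffs.append(Difference(kind, i, ref_b, qry_b))
    return diffs


for diff in find_differences("ACGT--ACGT", "ACCTGGAC-T"):
    print(diff)
```

In this scheme a single-column substitution, or an insertion or deletion of length one, corresponds to the single nucleotide polymorphism described above.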

[0499] Another aspect of the present invention is a method for determining the level of homology between a polypeptide code of SEQ ID NOs: 4101-8177 and a reference polypeptide sequence, comprising the steps of reading the polypeptide code of SEQ ID NOs: 4101-8177 and the reference polypeptide sequence through use of a computer program which determines homology levels and determining homology between the polypeptide code and the reference polypeptide sequence using the computer program.

[0500] Accordingly, another aspect of the present invention is a method for determining whether a nucleic acid code of SEQ ID NOs: 24-4100 and 8178-36681 differs at one or more nucleotides from a reference nucleotide sequence comprising the steps of reading the nucleic acid code and the reference nucleotide sequence through use of a computer program which identifies differences between nucleic acid sequences and identifying differences between the nucleic acid code and the reference nucleotide sequence with the computer program. In some embodiments, the computer program is a program which identifies single nucleotide polymorphisms. The method may be implemented by the computer systems described above. The method may also be performed by reading at least 2, 5, 10, 15, 20, 25, 30, or 50 of the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 and the reference nucleotide sequences through the use of the computer program and identifying differences between the nucleic acid codes and the reference nucleotide sequences with the computer program.
[0501] In other embodiments the computer based system may further comprise an identifier for identifying features within the nucleotide sequences of the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or the amino acid sequences of the polypeptide codes of SEQ ID NOs: 4101-8177.
[0502] An "identifier" refers to one or more programs which identify certain features within the above-described nucleotide sequences of the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or the amino acid sequences of the polypeptide codes of SEQ ID NOs: 4101-8177. In one embodiment, the identifier may comprise a program which identifies an open reading frame in the cDNA codes of SEQ ID NOs: 24-4100 and 8178-36681.
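An identifier for open reading frames could, for example, scan each of the three forward reading frames for an ATG codon followed in frame by a stop codon. The sketch below is one simple way to do this and is not the program used by the inventors. It searches only the forward strand, reports only frames closed by a stop codon, and its function name and `min_codons` parameter are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 2) -> List[Tuple[int, int, str]]:
    """Return (start, end, orf) for forward-strand ORFs of the form ATG...stop.

    `min_codons` counts coding codons, including ATG and excluding the stop.
    `end` is the index just past the stop codon.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


print(find_orfs("CCATGGCCTGATT"))
```

Because a partial cDNA may lack the stop codon, a practical identifier would probably also report frames that remain open at the end of the sequence; that case is left out here for brevity.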

[0503] In another embodiment, the identifier may comprise a molecular modeling program which determines the 3-dimensional structure of the polypeptide codes of SEQ ID NOs: 4101-8177. In some embodiments, the molecular modeling program identifies target sequences that are most compatible with profiles representing the structural environments of the residues in known three-dimensional protein structures. (See, e.g., Eisenberg et al., U.S. Patent No. 5,436,850 issued July 25, 1995). In another technique, the known three-dimensional structures of proteins in a given family are superimposed to define the structurally conserved regions in that family. This protein modeling technique also uses the known three-dimensional structure of a homologous protein to approximate the structure of the polypeptide codes of SEQ ID NOs: 4101-8177. (See e.g., Srinivasan, et al., U.S. Patent No. 5,557,535 issued September 17, 1996). Conventional homology modeling techniques have been used routinely to build models of proteases and antibodies. (Sowdhamini et al., Protein Engineering 10:207, 215 (1997)). Comparative approaches can also be used to develop three-dimensional protein models when the protein of interest has poor sequence identity to template proteins. In some cases, proteins fold into similar three-dimensional structures despite having very weak sequence identities. For example, the three-dimensional structures of a number of helical cytokines fold in similar three-dimensional topology in spite of weak sequence homology.
[0504] The recent development of threading methods now enables the identification of likely folding patterns in a number of situations where the structural relatedness between target and template(s) is not detectable at the sequence level. In hybrid methods, fold recognition is performed using Multiple Sequence Threading (MST), structural equivalencies are deduced from the threading output using a distance geometry program DRAGON to construct a low resolution model, and a full-atom representation is constructed using a molecular modeling package such as QUANTA.
[0505] According to this 3-step approach, candidate templates are first identified by using the novel fold recognition algorithm MST, which is capable of performing simultaneous threading of multiple aligned sequences onto one or more 3-D structures. In a second step, the structural equivalencies obtained from the MST output are converted into interresidue distance restraints and fed into the distance geometry program DRAGON, together with auxiliary information obtained from secondary structure predictions. The program combines the restraints in an unbiased manner and rapidly generates a large number of low resolution model conformations. In a third step, these low resolution model conformations are converted into full-atom models and subjected to energy minimization using the molecular modeling package QUANTA. (See e.g., Aszodi et al., Proteins: Structure, Function, and Genetics, Supplement 1:38-42 (1997)).

[0506] The results of the molecular modeling analysis may then be used in rational drug design techniques to identify agents which modulate the activity of the polypeptide codes of SEQ ID NOs: 4101-8177.
[0507] Accordingly, another aspect of the present invention is a method of identifying a feature within the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or the polypeptide codes of SEQ ID NOs: 4101-8177 comprising reading the nucleic acid code(s) or the polypeptide code(s) through the use of a computer program which identifies features therein and identifying features within the nucleic acid code(s) or polypeptide code(s) with the computer program. In one embodiment, the computer program comprises a computer program which identifies open reading frames. In a further embodiment, the computer program identifies structural motifs in a polypeptide sequence. In another embodiment, the computer program comprises a molecular modeling program. The method may be performed by reading a single sequence or at least 2, 5, 10, 15, 20, 25, 30, or 50 of the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or the polypeptide codes of SEQ ID NOs: 4101-8177 through the use of the computer program and identifying features within the nucleic acid codes or polypeptide codes with the computer program.
[0508] The nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or the polypeptide codes of SEQ ID NOs: 4101-8177 may be stored and manipulated in a variety of data processor programs in a variety of formats. For example, the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or the polypeptide codes of SEQ ID NOs: 4101-8177 may be stored as text in a word processing file, such as MicrosoftWORD or WORDPERFECT or as an ASCII file in a variety of database programs familiar to those of skill in the art, such as DB2, SYBASE, or ORACLE. In addition, many computer programs and databases may be used as sequence comparers, identifiers, or sources of reference nucleotide or polypeptide sequences to be compared to the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or the polypeptide codes of SEQ ID NOs: 4101-8177. The following list is intended not to limit the invention but to provide guidance to programs and databases which are useful with the nucleic acid codes of SEQ ID NOs: 24-4100 and 8178-36681 or the polypeptide codes of SEQ ID NOs: 4101-8177. The programs and databases which may be used include, but are not limited to: MacPattern (EMBL), DiscoveryBase (Molecular Applications Group), GeneMine (Molecular Applications Group), Look (Molecular Applications Group), MacLook (Molecular Applications Group), BLAST and BLAST2 (NCBI), BLASTN and BLASTX (Altschul et al., J. Mol. Biol. 215: 403 (1990)), FASTA (Pearson and Lipman, Proc. Natl. Acad. Sci. USA, 85: 2444 (1988)), FASTDB (Brutlag et al., Comp. App. Biosci. 6:237-245, 1990), Catalyst (Molecular Simulations Inc.), Catalyst/SHAPE (Molecular Simulations Inc.), Cerius2.DBAccess (Molecular Simulations Inc.), HypoGen (Molecular Simulations Inc.), Insight II (Molecular Simulations Inc.), Discover (Molecular Simulations Inc.), CHARMm (Molecular Simulations Inc.), Felix (Molecular Simulations Inc.), DelPhi (Molecular Simulations Inc.), QuanteMM (Molecular Simulations Inc.), Homology (Molecular Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS (Molecular Simulations Inc.), Quanta/Protein Design (Molecular Simulations Inc.), WebLab (Molecular Simulations Inc.), WebLab Diversity Explorer (Molecular Simulations Inc.), Gene Explorer (Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), the EMBL/Swissprotein database, the MDL Available Chemicals Directory database, the MDL Drug Data Report data base, the Comprehensive Medicinal Chemistry database, Derwent's World Drug Index database, the BioByteMasterFile database, the Genbank database, and the Genseqn database. Many other programs and data bases would be apparent to one of skill in the art given the present disclosure.
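As one hedged illustration of storage "as an ASCII file", the sketch below writes sequence codes to plain text and reads them back. The layout (a ">" header line followed by wrapped sequence lines) is a FASTA-style convention chosen here as an assumption; the specification does not prescribe a file layout. The function names and record names are hypothetical.

```python
import io
from typing import Dict, List, TextIO


def write_codes(codes: Dict[str, str], handle: TextIO, width: int = 60) -> None:
    """Write each code as a '>' header line followed by wrapped sequence lines."""
    for name, sequence in codes.items():
        handle.write(f">{name}\n")
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_codes(handle: TextIO) -> Dict[str, str]:
    """Read codes written by write_codes back into a name -> sequence mapping."""
    parts: Dict[str, List[str]] = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {key: "".join(chunks) for key, chunks in parts.items()}


# Made-up records; an in-memory buffer stands in for a file on a storage device.
records = {"example_nucleic_acid": "ACGT" * 20, "example_polypeptide": "MKLV"}
buffer = io.StringIO()
write_codes(records, buffer)
buffer.seek(0)
print(read_codes(buffer) == records)
```

Replacing the in-memory buffer with a file opened in text mode gives the same behaviour on any of the computer readable media listed earlier.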
[0509] Motifs which may be detected using the above programs include sequences encoding leucine zippers, helix-turn-helix motifs, glycosylation sites, ubiquitination sites, alpha helices, and beta sheets, signal sequences encoding signal peptides which direct the secretion of the encoded proteins, sequences implicated in transcription regulation such as homeoboxes, acidic stretches, enzymatic active sites, substrate binding sites, and enzymatic cleavage sites.
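Two of the motif classes listed above can be searched for with simple patterns. The sketch below is illustrative only: it uses regular expressions modelled on the widely used PROSITE-style consensus patterns for a leucine zipper (L-x(6)-L-x(6)-L-x(6)-L) and an N-glycosylation site (N-{P}-[ST]-{P}) rather than any of the programs named in this specification. The dictionary, function name, and test protein are hypothetical, and a pattern match only suggests a candidate site.

```python
import re
from typing import Dict, List

# Lookahead groups let overlapping occurrences be reported.
MOTIF_PATTERNS = {
    # Four leucines spaced seven residues apart.
    "leucine_zipper": re.compile(r"(?=(L.{6}L.{6}L.{6}L))"),
    # Asn, any residue but Pro, Ser or Thr, any residue but Pro.
    "n_glycosylation": re.compile(r"(?=(N[^P][ST][^P]))"),
}


def find_motifs(protein: str) -> Dict[str, List[int]]:
    """Return the 0-based start positions of each motif in the protein."""
    seq = protein.upper()
    return {name: [m.start() for m in pattern.finditer(seq)]
            for name, pattern in MOTIF_PATTERNS.items()}


# Made-up polypeptide containing one candidate site of each kind.
protein = "MK" + "LAAAAAA" * 3 + "L" + "GNGSA"
print(find_motifs(protein))
```

Structural motifs such as alpha helices and beta sheets cannot be found this way and would need the modeling approaches described earlier in this Example.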

EXAMPLE 61

Methods of Making Nucleic Acids

[0510] The present invention also comprises methods of making the EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of the EST-related nucleic acids, or fragments of positional segments of the EST-related nucleic acids. The methods comprise sequentially linking together nucleotides to produce the nucleic acids having the preceding sequences. A variety of methods of synthesizing nucleic acids are known to those skilled in the art.
[0511] In many of these methods, synthesis is conducted on a solid support. These include the 3' phosphoramidite methods in which the 3' terminal base of the desired oligonucleotide is immobilized on an insoluble carrier. The nucleotide base to be added is blocked at the 5' hydroxyl and activated at the 3' hydroxyl so as to cause coupling with the immobilized nucleotide base. Deblocking of the new immobilized nucleotide compound and repetition of the cycle will produce the desired polynucleotide. Alternatively, polynucleotides may be prepared as described in U.S. Patent No. 5,049,656. In some embodiments, several polynucleotides prepared as described above are ligated together to generate longer polynucleotides having a desired sequence.

EXAMPLE 62

Methods of Making Polypeptides

[0512] The present invention also comprises methods of making the polypeptides encoded by EST-related nucleic acids, fragments of EST-related nucleic acids, positional segments of the EST-related nucleic acids, or fragments of positional segments of the EST-related nucleic acids and methods of making the EST-related polypeptides, fragments of EST-related polypeptides, positional segments of EST-related polypeptides, or fragments of positional segments of EST-related polypeptides. The methods comprise sequentially linking together amino acids to produce the polypeptides having the preceding sequences. In some embodiments, the polypeptides made by these methods are 150 amino acids or less in length. In other embodiments, the polypeptides made by these methods are 120 amino acids or less in length.
[0513] A variety of methods of making polypeptides are known to those skilled in the art, including methods in which the carboxyl terminal amino acid is bound to polyvinyl benzene or another suitable resin. The amino acid to be added possesses blocking groups on its amino moiety and any side chain reactive groups so that only its carboxyl moiety can react. The carboxyl group is activated with carbodiimide or another activating agent and allowed to couple to the immobilized amino acid. After removal of the blocking group, the cycle is repeated to generate a polypeptide having the desired sequence. Alternatively, the methods described in U.S. Patent No. 5,049,656 may be used.
[0514] As discussed above, the EST-related nucleic acids, fragments of the EST-related nucleic acids, positional segments of the EST-related nucleic acids, or fragments of positional segments of the EST-related nucleic acids can be used for various purposes. The polynucleotides can be used to express recombinant protein for analysis, characterization or therapeutic use; production of secreted polypeptides or chimeric polypeptides, antibody production, as markers for tissues in which the corresponding protein is preferentially expressed (either constitutively or at a particular stage of tissue differentiation or development or in disease states); as molecular weight markers on Southern gels; as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene positions; to compare with endogenous DNA sequences in patients to identify potential genetic disorders; as probes to hybridize and thus discover novel related DNA sequences; as a source of information to derive PCR primers for genetic fingerprinting; for selecting and making oligomers for attachment to a "gene chip" or other support, including for examination for expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or elicit another immune response. Where the polynucleotide encodes a protein or polypeptide which binds or potentially binds to another protein or polypeptide (such as, for example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify polynucleotides encoding the other protein or polypeptide with which binding occurs or to identify inhibitors of the binding interaction.
[0515] The proteins or polypeptides provided by the present invention can similarly be used in assays to determine biological activity, including in a panel of multiple proteins for high-throughput screening; to raise antibodies or to elicit another immune response; as a reagent (including the labeled reagent) in assays designed to quantitatively determine levels of the protein (or its receptor) in biological fluids; as markers for tissues in which the corresponding protein is preferentially expressed (either constitutively or at a particular stage of tissue differentiation or development or in a disease state); and, of course, to isolate correlative receptors or ligands. Where the protein or polypeptide binds or potentially binds to another protein or polypeptide (such as, for example, in a receptor-ligand interaction), the protein can be used to identify the other protein with which binding occurs or to identify inhibitors of the binding interaction. Proteins or polypeptides involved in these binding interactions can also be used to screen for peptide or small molecule inhibitors or agonists of the binding interaction.

[0516] Any or all of these research utilities are capable of being developed into reagent grade or kit format for commercialization as research products.
[0517] Methods for performing the uses listed above are well known to those skilled in the art. References disclosing such methods include without limitation "Molecular Cloning; A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E.F. Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology; Guide to Molecular Cloning Techniques", Academic Press, Berger, S.L. and A.R. Kimmel eds., 1987.
[0518] Polynucleotides and proteins of the present invention can also be used as nutritional sources or supplements. Such uses include without limitation use as a protein or amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In such cases the protein or polynucleotide of the invention can be added to the feed of a particular organism or can be administered as a separate solid or liquid preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the protein or polynucleotide of the invention can be added to the medium in or on which the microorganism is cultured.
[0519] Although this invention has been described in terms of certain preferred embodiments, other embodiments which will be apparent to those of ordinary skill in the art in view of the disclosure herein are also within the scope of this invention. Accordingly, the scope of the invention is intended to be defined only by reference to the appended claims.



Claims

1. A purified nucleic acid comprising a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.

2. A purified nucleic acid comprising at least 10 consecutive nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.

3. A purified nucleic acid comprising at least 15 consecutive nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.

4. A purified nucleic acid comprising the coding sequence of a sequence selected from the group consisting of SEQ ID NOs: 24-4100.

5. A purified nucleic acid comprising the full coding sequences of a sequence selected from the group consisting of SEQ ID NOs: 3721-3811 wherein the full coding sequence comprises the sequence encoding the signal peptide and the sequence encoding the mature protein.

6. A purified nucleic acid comprising a contiguous span of a sequence selected from the group consisting of SEQ ID NOs: 3721-3811 which encodes the mature protein.

7. A purified nucleic acid comprising a contiguous span of a sequence selected from the group consisting of SEQ ID NOs: 24-652 and 3721-3811 which encode the signal peptide.

8. A purified nucleic acid encoding a polypeptide comprising a sequence selected from the group consisting of the sequences of SEQ ID NOs: 4101-8177.

9. A purified nucleic acid encoding a polypeptide comprising a sequence selected from the group consisting of the sequences of SEQ ID NOs: 7798-7888.

10. A purified nucleic acid encoding a polypeptide comprising a mature protein included in a sequence selected from the group consisting of the sequences of SEQ ID NOs: 7798-7888.

11. A purified nucleic acid encoding a polypeptide comprising a signal peptide included in a sequence selected from the group consisting of the sequences of SEQ ID NOs: 4101-4729 and 7798-7888.

12. A purified nucleic acid at least 15 nucleotides in length which hybridizes under stringent conditions to a sequence selected from the group consisting of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681 and sequences complementary to the sequences of SEQ ID NOs: 24-4100 and SEQ ID NOs: 8178-36681.



13. A purified or isolated polypeptide comprising a sequence selected from the group consisting of the sequences of SEQ ID NOs: 4101-8177.

14. A purified or isolated polypeptide comprising a sequence selected from the group consisting of SEQ ID NOs: 7798-7888.

15. A purified or isolated polypeptide comprising a mature protein of a polypeptide selected from the group consisting of SEQ ID NOs: 7798-7888.

16. A purified or isolated polypeptide comprising a signal peptide of a sequence selected from the group consisting of the polypeptides of SEQ ID NOs: 4101-4729 and 7798-7888.

17. A purified or isolated polypeptide comprising at least 10 consecutive amino acids of a sequence selected from the group consisting of the sequences of SEQ ID NOs: 4101-8177.

18. A method of making a cDNA comprising the steps
of:

contacting a collection of mRNA molecules
from human cells with a primer comprising at
least 15 consecutive nucleotides of a sequence
selected from the group consisting of the sequences
complementary to SEQ ID NOs:
24-4100 and SEQ ID NOs: 8178-36681;
hybridizing said primer to an mRNA in said collection
that encodes said protein;
reverse transcribing said hybridized primer to
make a first cDNA strand from said mRNA;
making a second cDNA strand complementary
to said first cDNA strand; and
isolating the resulting cDNA encoding said protein
comprising said first cDNA strand and said
second cDNA strand.

19. A purified cDNA obtainable by the method of Claim 
18. 

20. The cDNA of Claim 19 wherein said cDNA encodes
at least a portion of a human polypeptide.

21. A method of making a cDNA comprising the steps
of:

obtaining a cDNA comprising a sequence selected
from the group consisting of SEQ ID
NOs: 24-4100 and SEQ ID NOs: 8178-36681;
contacting said cDNA with a detectable probe
comprising at least 15 consecutive nucleotides
of a sequence selected from the group consisting
of SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681 and the sequences complementary
to SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681 under conditions which permit said
probe to hybridize to said cDNA;
identifying a cDNA which hybridizes to said detectable
probe; and
isolating said cDNA which hybridizes to said
probe.

22. A purified cDNA obtainable by the method of Claim
21.

23. The cDNA of Claim 22 wherein said cDNA encodes
at least a portion of a human polypeptide.

24. A method of making a cDNA comprising the steps
of:

contacting a collection of mRNA molecules
from human cells with a first primer capable of
hybridizing to the polyA tail of said mRNA;
hybridizing said first primer to said polyA tail;
reverse transcribing said mRNA to make a first
cDNA strand;
making a second cDNA strand complementary
to said first cDNA strand using at least one
primer comprising at least 15 consecutive nucleotides
of a sequence selected from the
group consisting of SEQ ID NOs: 24-4100 and
SEQ ID NOs: 8178-36681; and
isolating the resulting cDNA comprising said
first cDNA strand and said second cDNA
strand.

25. A purified cDNA obtainable by the method of Claim
24.

26. The cDNA of Claim 25 wherein said cDNA encodes
at least a portion of a human polypeptide.

27. The method of Claim 24, wherein the second cDNA
strand is made by:

contacting said first cDNA strand with a first pair
of primers, said first pair of primers comprising
a second primer comprising at least 15 consecutive
nucleotides of a sequence selected from
the group consisting of SEQ ID NOs: 24-4100
and SEQ ID NOs: 8178-36681 and a third primer
having a sequence therein which is included
within the sequence of said first primer;
performing a first polymerase chain reaction
with said first pair of primers to generate a first
PCR product;
contacting said first PCR product with a second
pair of primers, said second pair of primers
comprising a fourth primer, said fourth primer
comprising at least 15 consecutive nucleotides
of said sequence selected from the group consisting
of SEQ ID NOs: 24-4100 and SEQ ID
NOs: 8178-36681, and a fifth primer, wherein
said fourth and fifth primers hybridize to sequences
within said first PCR product; and
performing a second polymerase chain reaction,
thereby generating a second PCR product.

28. A purified cDNA obtainable by the method of Claim
27.

29. The cDNA of Claim 28 wherein said cDNA encodes 
at least a portion of a human polypeptide. 

30. The method of Claim 24 wherein the second cDNA
strand is made by:

contacting said first cDNA strand with a second
primer comprising at least 15 consecutive nucleotides
of a sequence selected from the
group consisting of SEQ ID NOs: 24-4100 and
SEQ ID NOs: 8178-36681;
hybridizing said second primer to said first
strand cDNA; and
extending said hybridized second primer to
generate said second cDNA strand.

31. A purified cDNA obtainable by the method of Claim
30.

32. The cDNA of Claim 28, wherein said cDNA encodes
at least a portion of a human polypeptide.

33. A method of making a polypeptide comprising the
steps of:

obtaining a cDNA which encodes a polypeptide
encoded by a nucleic acid comprising a sequence
selected from the group consisting of
SEQ ID NOs: 24-4100 or a cDNA which encodes
a polypeptide comprising at least [number
illegible] consecutive amino acids of a polypeptide encoded
by a sequence selected from the group consisting
of SEQ ID NOs: 24-4100;
inserting said cDNA in an expression vector
such that said cDNA is operably linked to a promoter;
introducing said expression vector into a host
cell whereby said host cell produces the protein
encoded by said cDNA; and
isolating said protein.

34. An isolated protein obtainable by the method of
Claim 33.

35. A method of obtaining a promoter DNA comprising
the steps of:

obtaining genomic DNA located upstream of a
nucleic acid comprising a sequence selected
from the group consisting of SEQ ID NOs:
24-4100 and SEQ ID NOs: 8178-36681 and the
sequences complementary to
SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681;
screening said genomic DNA to identify a promoter
capable of directing transcription initiation; and
isolating said DNA comprising said identified
promoter.

36. The method of Claim 35, wherein said obtaining
step comprises walking from genomic DNA comprising
a sequence selected from the group consisting
of SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681 and the sequences complementary to
SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681.

37. The method of Claim 36, wherein said screening
step comprises inserting genomic DNA located upstream
of a sequence selected from the group consisting
of SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681 and the sequences complementary to
SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681 into a promoter reporter vector.

38. The method of Claim 36, wherein said screening
step comprises identifying motifs in genomic DNA
located upstream of a sequence selected from the
group consisting of SEQ ID NOs: 24-4100 and SEQ
ID NOs: 8178-36681 and the sequences complementary
to SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681 which are transcription factor binding
sites or transcription start sites.

39. An isolated promoter obtainable by the method of
any one of Claims 34 to 38.

40. In an array of discrete ESTs or fragments thereof of
at least 15 nucleotides in length, the improvement
comprising inclusion in said array of at least one sequence
selected from the group consisting of SEQ
ID NOs: 24-4100 and SEQ ID NOs: 8178-36681,
the sequences complementary to the sequences of
SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681 and fragments comprising at least 15
consecutive nucleotides of said sequence.

41. The array of Claim 40 including therein at least two
sequences selected from the group consisting of
SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681, the sequences complementary to the
sequences of SEQ ID NOs: 24-4100 and SEQ ID
NOs: 8178-36681, and fragments comprising at
least 15 consecutive nucleotides of said sequences.

42. The array of Claim 40 including therein at least five
sequences selected from the group consisting of
SEQ ID NOs: 24-4100 and SEQ ID NOs:
8178-36681, the sequences complementary to the
sequences of SEQ ID NOs: 24-4100 and SEQ ID
NOs: 8178-36681 and fragments comprising at
least 15 consecutive nucleotides of said sequences.

43. An enriched population of recombinant nucleic acids,
said recombinant nucleic acids comprising an
insert nucleic acid and a backbone nucleic acid,
wherein at least 5% of said insert nucleic acids in
said population comprise a sequence selected from
the group consisting of SEQ ID NOs: 24-4100 and
SEQ ID NOs: 8178-36681 and the sequences complementary
to SEQ ID NOs: 24-4100 and SEQ ID
NOs: 8178-36681.

44. A purified or isolated antibody capable of specifically
binding to a polypeptide comprising a sequence
selected from the group consisting of SEQ ID NOs:
4101-8177.

45. A purified or isolated antibody capable of specifically
binding to a polypeptide comprising at least 10
consecutive amino acids of a sequence selected
from the group consisting of SEQ ID NOs:
4101-8177.

46. An antibody composition capable of selectively
binding to an epitope-containing fragment of a
polypeptide comprising a contiguous span of at
least 8 amino acids of any of SEQ ID NOs:
4101-8177, wherein said antibody is polyclonal or
monoclonal.

47. A computer readable medium having stored thereon
a sequence selected from the group consisting
of a nucleic acid code of SEQ ID NOs: 24-4100 and
8178-36681 and a polypeptide code of SEQ ID
NOs: 4101-8177.

48. A computer system comprising a processor and a
data storage device wherein said data storage device
has stored thereon a sequence selected from
the group consisting of a nucleic acid code of SEQ
ID NOs: 24-4100 and 8178-36681 and a polypeptide
code of SEQ ID NOs: 4101-8177.

49. The computer system of Claim 48 further comprising
a sequence comparer and a data storage device
having reference sequences stored thereon.

50. The computer system of Claim 49 wherein said sequence
comparer comprises a computer program
which indicates polymorphisms.

51. The computer system of Claim 48 further comprising
an identifier which identifies features in said sequence.

52. A method for comparing a first sequence to a reference
sequence wherein said first sequence is selected
from the group consisting of a nucleic acid
code of SEQ ID NOs: 24-4100 and 8178-36681 and
a polypeptide code of SEQ ID NOs: 4101-8177
comprising the steps of:

reading said first sequence and said reference
sequence through use of a computer program
which compares sequences; and
determining differences between said first sequence
and said reference sequence with said
computer program.

53. The method of Claim 52, wherein said step of determining
differences between the first sequence
and the reference sequence comprises identifying
polymorphisms.

54. A method for identifying a feature in a sequence selected
from the group consisting of a nucleic acid
code of SEQ ID NOs: 24-4100 and 8178-36681 and
a polypeptide code of SEQ ID NOs: 4101-8177
comprising the steps of:

reading said sequence through the use of a
computer program which identifies features in
sequences; and
identifying features in said sequence with said
computer program.

55. A vector comprising a nucleic acid according to any
one of Claims 1 to 12.

56. A host cell containing a nucleic acid of Claim 55.

57. A method of making a nucleic acid comprising
the steps of:

introducing said nucleic acid into a host cell
such that said nucleic acid is present in multiple
copies in each host cell; and
isolating said nucleic acid from said host cell.

58. A method of making a nucleic acid of any one of
Claims 1 to 12 comprising the step of sequentially
linking together the nucleotides in said nucleic acids.

59. A method of making a polypeptide of any one of
Claims 13 to 17 wherein said polypeptide is 150
amino acids in length or less comprising the step of
sequentially linking together the amino acids in said
polypeptide.

60. A method of making a polypeptide of any one of
Claims 13 to 17 wherein said polypeptide is 120
amino acids in length or less comprising the step of
sequentially linking together the amino acids in said
polypeptide.
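
The comparison and polymorphism-identification steps recited in Claims 52 and 53 can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned and of equal length, which the claims do not require. The function name `find_differences` and the example sequences are hypothetical and are not taken from the sequence listing.

```python
from typing import List, Tuple


def find_differences(first: str, reference: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, first residue, reference residue) per mismatch."""
    if len(first) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    first, reference = first.upper(), reference.upper()
    return [
        (index + 1, a, b)
        for index, (a, b) in enumerate(zip(first, reference))
        if a != b
    ]


# Hypothetical example sequences (not from the sequence listing).
candidate = "ATGGCCTTAGC"
reference = "ATGGCGTTAGC"
for position, observed, expected in find_differences(candidate, reference):
    print(f"position {position}: {observed} in first sequence, {expected} in reference")
```

A single-position difference of this kind is the simplest case of what Claim 53 calls a polymorphism; insertions and deletions would need a real alignment step first.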
Minimum signal peptide score   False positive rate   False negative rate   proba(0.1)   proba(0.2)
3.5                            0.121                 0.036                 0.467        0.664
4                              0.096                 0.06                  0.519        0.708
4.5                            0.078                 0.079                 0.565        0.745
5                              0.062                 0.098                 0.615        0.782
5.5                            0.05                  0.127                 0.659        0.813
6                              0.04                  0.163                 0.694        0.836
6.5                            0.033                 0.202                 0.725        0.855
7                              0.025                 0.248                 0.763        0.878
7.5                            0.021                 0.304                 0.78         0.889
8                              0.015                 0.368                 0.816        0.909
8.5                            0.012                 0.418                 0.836        0.92
9                              0.009                 0.512                 0.856        0.93
9.5                            0.007                 0.581                 0.863        0.934
10                             0.006                 0.679                 0.835        0.919
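
The table does not define the proba(0.1) and proba(0.2) columns. One reading that approximately reproduces the tabulated values, offered here as an assumption rather than as the method actually used, is the probability that a sequence scoring at or above the threshold is a genuine signal peptide when 10% or 20% of the screened sequences are genuine. Under that assumption Bayes' rule gives the sketch below; the function name `posterior` is hypothetical.

```python
def posterior(prior: float, false_positive: float, false_negative: float) -> float:
    """P(genuine | score >= threshold) by Bayes' rule (assumed interpretation)."""
    true_hits = prior * (1.0 - false_negative)      # genuine and detected
    false_hits = (1.0 - prior) * false_positive     # not genuine but detected
    return true_hits / (true_hits + false_hits)


# Rates taken from the row for a minimum score of 3.5.
for prior in (0.1, 0.2):
    print(prior, round(posterior(prior, 0.121, 0.036), 3))
```

For this row the computed values fall within 0.01 of the tabulated 0.467 and 0.664; the agreement is looser for the highest thresholds, so the interpretation should be treated as approximate.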
Description of Transcription Factor Binding Sites present on promoters isolated from
SignalTag sequences



Promoter sequence P13H2 (546 bp):

Matrix           Position  Orientation  Score  Length  Sequence          Location in SEQ ID NO: 17
CMYB_01          -502      +            0.983  9       TGTCAGTTG         17-25
MYOD_Q6          -501      -            0.961  10      CCCAACTGAC        complement of 18-27
S8_01            -444      -            0.960  11      AATAGAATTAG       complement of 75-85
S8_01            -425      +            0.966  11      AACTAAATTAG       94-104
DELTAEF1_01      -390      -            0.960  11      GCACACCTCAG       complement of 129-139
GATA_C           -364      -            0.964  11      AGATAAATCCA       complement of 155-165
CMYB_01          -349      +            0.958  9       CTTCAGTTG         170-178
GATA1_02         -343      +            0.959  14      TTGTAGATAGGACA    176-189
GATA_C           -339      +            0.953  11      AGATAGGACAT       180-190
TAL1ALPHAE47_01  -235      +            0.973  16      CATAACAGATGGTAAG  284-299
TAL1BETAE47_01   -235      +            0.983  16      CATAACAGATGGTAAG  284-299
TAL1BETAITF2_01  -235      +            0.978  16      CATAACAGATGGTAAG  284-299
MYOD_Q6          -232      -            0.954  10      ACCATCTGTT        complement of 287-296
GATA1_04         -217      -            0.953  13      TCAAGATAAAGTA     complement of 302-314
IK1_01           -126      +            0.963  13      AGTTGGGAATTCC     393-405
IK2_01           -126      +            0.985  12      AGTTGGGAATTC      393-404
CREL_01          -123      +            0.962  10      TGGGAATTCC        396-405
GATA1_02         -96       +            0.950  14      TCAGTGATATGGCA    423-436
SRY_02           -41       -            0.951  12      TAAAACAAAACA      complement of 478-489
E2F_02           -33       +            0.957  8       TTTAGCGC          486-493
MZF1_01          -5        -            0.975  8       TGAGGGGA          complement of 514-521
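
Each row of the table above is internally redundant: the Length column should equal both the length of the listed sequence and the span of the Location range. A minimal Python sketch of that consistency check follows, using three rows from the P13H2 table. The names `Hit`, `parse_location` and `is_consistent` are hypothetical, and the matrix names are written with underscores as an assumed normalization.

```python
import re
from typing import NamedTuple, Tuple


class Hit(NamedTuple):
    matrix: str
    position: int
    length: int
    sequence: str
    location: str


def parse_location(location: str) -> Tuple[int, int, bool]:
    """Return (start, end, is_complement) from e.g. 'complement of 75-85'."""
    match = re.fullmatch(r"(complement of )?(\d+)-(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {location!r}")
    return int(match.group(2)), int(match.group(3)), match.group(1) is not None


def is_consistent(hit: Hit) -> bool:
    """True when the length, the sequence and the location span all agree."""
    start, end, _ = parse_location(hit.location)
    return end - start + 1 == hit.length == len(hit.sequence)


hits = [
    Hit("S8_01", -444, 11, "AATAGAATTAG", "complement of 75-85"),
    Hit("S8_01", -425, 11, "AACTAAATTAG", "94-104"),
    Hit("GATA1_02", -343, 14, "TTGTAGATAGGACA", "176-189"),
]
for hit in hits:
    start, _, _ = parse_location(hit.location)
    # The last printed value is the offset between Location start and Position.
    print(hit.matrix, is_consistent(hit), start - hit.position)
```

For the rows sampled here the offset between the Location start and the Position column is the same in every row, which gives a second check on individual entries.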



Promoter sequence P15B4 (861 bp):

Matrix           Position  Orientation  Score  Length  Sequence          Location in SEQ ID NO: 20
NFY_Q6           -748      -            0.956  11      CGACCAATCAT       complement of 60-70
MZF1_01          -738      +            0.962  8       CCTGGGGA          70-77
CMYB_01          -684      +            0.994  9       TGACCGTTG         124-132
VMYB_02          -682      -            0.985  9       TCCAACGGT         complement of 126-134
STAT_01          -673      +            0.968  9       TTCCTGGAA         135-143
STAT_01          -673      -            0.951  9       TTCCAGGAA         complement of 135-143
MZF1_01          -556      -            0.956  8       TTGGGGGA          complement of 252-259
IK2_01           -451      +            0.965  12      GAATGGGATTTC      357-368
MZF1_01          -424      +            0.986  8       AGAGGGGA          384-391
SRY_02           -398      -            0.955  12      GAAAACAAAACA      complement of 410-421
MZF1_01          -216      +            0.960  8       GAAGGGGA          592-599
MYOD_Q6          -190      +            0.981  10      AGCATCTGCC        618-627
DELTAEF1_01      -176      +            0.958  11      TCCCACCTTCC       632-642
S8_01            5         -            0.992  11      GAGGCAATTAT       complement of 813-823
MZF1_01          16        -            0.986  8       AGAGGGGA          complement of 824-831
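
Entries marked "complement of" give the site as read on the opposite strand. As a sketch of how such an entry relates to the forward-strand sequence over the same coordinates, the Python snippet below reverse-complements the STAT_01 site at 135-143 and compares it with the STAT_01 entry listed as the complement of 135-143. The helper name `reverse_complement` is hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


forward_site = "TTCCTGGAA"   # STAT_01, location 135-143
reverse_site = "TTCCAGGAA"   # STAT_01, complement of 135-143
print(reverse_complement(forward_site) == reverse_site)
```

The same check can be applied to any pair of overlapping entries to confirm that a forward-strand hit and a complement hit describe the same stretch of the promoter.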



Matrix           Position  Orientation  Score  Length  Sequence          Location in SEQ ID NO: 23
ARNT_01          -311      +            0.964  16      GGACTCACGTGCTGCT  191-206
NMYC_01          -309      +            0.965  12      ACTCACGTGCTG      193-204
USF_01           -309      +            0.985  12      ACTCACGTGCTG      193-204
USF_01           -309      -            0.985  12      CAGCACGTGAGT      complement of 193-204
NMYC_01          -309      -            0.956  12      CAGCACGTGAGT      complement of 193-204
MYCMAX_02        -309      -            0.972  12      CAGCACGTGAGT      complement of 193-204
USF_C            -307      +            0.997  8       TCACGTGC          195-202
USF_C            -307      -            0.991  8       GCACGTGA          complement of 195-202
MZF1_01          -292      -            0.968  8       CATGGGGA          complement of 210-217
ELK1_02          -105      +            0.963  14      CTCTCCGGAAGCCT    397-410
CETS1P54_01      -102      +            0.974  10      TCCGGAAGCC        400-409
AP1_Q4           -42       -            0.963  11      AGTGACTGAAC       complement of 460-470
AP1FJ_Q2         -42       -            0.961  11      AGTCACTCAAC       complement of 460-470
PADS_C           45        +            1.000  9       TGTOGTCTC         547-555
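
Several matrices in this table report hits on the same stretch of sequence, for example the entries between positions 191 and 206. A minimal Python sketch for grouping such hits merges overlapping location ranges into distinct regions. The function name `merge_overlapping` is hypothetical, and treating any shared base as an overlap is an assumption of this example rather than a rule stated in the document.

```python
from typing import List, Tuple


def merge_overlapping(spans: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge inclusive (start, end) spans that share at least one base."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Extend the current region when the new span overlaps it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Location ranges from the SEQ ID NO: 23 table (strand ignored).
spans = [(191, 206), (193, 204), (195, 202), (397, 410), (400, 409), (460, 470)]
print(merge_overlapping(spans))
```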